gms_preprocessing.processing package

Submodules

gms_preprocessing.processing.multiproc module

gms_preprocessing.processing.multiproc.MAP(func, args, CPUs=None, flatten_output=False)[source]

Parallelize the execution of the given function. NOTE: if Job.CPUs in config is 1, execution is not parallelized.

Parameters
  • func (any) – function to parallelize

  • args (list) – function arguments

  • CPUs (Optional[int]) – number of CPUs to use

  • flatten_output (bool) – whether to flatten the output list, e.g. [[Tile1Scene1, Tile2Scene1], Tile1Scene2, Tile2Scene2] to [Tile1Scene1, Tile2Scene1, Tile1Scene2, Tile2Scene2]

Return type

list
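The behaviour described above can be sketched with the standard library alone. This is not the gms_preprocessing implementation: `map_sketch`, `flatten` and `split_scene` are hypothetical names, and the sketch assumes the CPU count is passed in directly rather than read from the job configuration.

```python
from multiprocessing import Pool


def flatten(results):
    """Flatten one level of nesting; items that are not lists are kept as they are."""
    flat = []
    for item in results:
        if isinstance(item, list):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


def map_sketch(func, args, CPUs=None, flatten_output=False):
    """Hypothetical stand-in for MAP: apply func to every element of args."""
    if CPUs == 1:
        # Serial execution: no worker processes are started.
        results = [func(arg) for arg in args]
    else:
        # Pool(processes=None) uses all CPUs reported by the operating system.
        with Pool(processes=CPUs) as pool:
            results = pool.map(func, args)
    return flatten(results) if flatten_output else results


def split_scene(scene):
    """Toy worker (assumption): returns two 'tiles' per scene name."""
    return ["Tile1" + scene, "Tile2" + scene]
```

With `CPUs=1`, `map_sketch(split_scene, ["Scene1", "Scene2"], CPUs=1, flatten_output=True)` returns `["Tile1Scene1", "Tile2Scene1", "Tile1Scene2", "Tile2Scene2"]`. When worker processes are used, the function passed in has to be picklable, which in practice means it should be defined at module level.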

gms_preprocessing.processing.multiproc.imap_unordered(func, args, CPUs=None, flatten_output=False)[source]

Parallelize the execution of the given function. NOTE: if Job.CPUs in config is 1, execution is not parallelized.

Parameters
  • func (any) – function to parallelize

  • args (list) – function arguments

  • CPUs (Optional[int]) – number of CPUs to use

  • flatten_output (bool) – whether to flatten the output list, e.g. [[Tile1Scene1, Tile2Scene1], Tile1Scene2, Tile2Scene2] to [Tile1Scene1, Tile2Scene1, Tile1Scene2, Tile2Scene2]

Return type

list
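As the name suggests, results of an unordered map arrive in completion order, not input order. The following sketch shows one way to restore the input order by carrying an index through the worker. It is an illustration under assumptions, not the library's code: `imap_unordered_sketch` and `tagged_square` are hypothetical, and a `ThreadPool` is used only so the example runs without pickling constraints (`multiprocessing.Pool` offers the same `imap_unordered` method).

```python
from multiprocessing.pool import ThreadPool


def imap_unordered_sketch(func, args, CPUs=None):
    """Hypothetical stand-in: collect results in the order they complete."""
    if CPUs == 1:
        # Serial execution, mirroring the note above.
        return [func(arg) for arg in args]
    with ThreadPool(processes=CPUs) as pool:
        # Consume the iterator inside the with-block, before the pool is closed.
        return list(pool.imap_unordered(func, args))


def tagged_square(indexed_value):
    """Return the input index together with the result so order can be restored."""
    index, value = indexed_value
    return index, value * value


values = [3, 1, 2]
unordered = imap_unordered_sketch(tagged_square, list(enumerate(values)), CPUs=2)

# Sorting by the carried index restores the input order.
ordered = [result for _, result in sorted(unordered)]
```

If the order of results does not matter, the index can be dropped and the results consumed as they arrive.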

gms_preprocessing.processing.multiproc.is_mainprocess()[source]

Return True if the current process is the main process and False if it is a multiprocessing child process.

Return type

bool
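A check of this kind can be written with the standard library. The sketch below is an assumption about how such a test might look, not necessarily how gms_preprocessing implements it; `is_mainprocess_sketch` is a hypothetical name.

```python
import multiprocessing


def is_mainprocess_sketch():
    """Return True unless called from a multiprocessing child process."""
    # The standard library names the initial Python process 'MainProcess';
    # child processes started by multiprocessing receive other names.
    return multiprocessing.current_process().name == "MainProcess"
```

Such a check is useful for restricting side effects such as console output or progress reporting to the parent process.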

gms_preprocessing.processing.pipeline module

gms_preprocessing.processing.pipeline.L1A_map(dataset_dict)[source]

Return type

L1A_object

gms_preprocessing.processing.pipeline.L1A_map_1(dataset_dict, block_size=None)[source]

Return type

List[L1A_object]

gms_preprocessing.processing.pipeline.L1A_map_2(L1A_tile)[source]

Return type

L1A_object

gms_preprocessing.processing.pipeline.L1A_map_3(L1A_obj)[source]

Return type

L1A_object

gms_preprocessing.processing.pipeline.L1B_map(L1A_obj)[source]

In the Python implementation (as opposed to the inmem_serialization implementation), L1A_obj contains NO ARRAY DATA, only the metadata that are valid for the whole scene.

Return type

L1B_object

gms_preprocessing.processing.pipeline.L1C_map(L1B_objs)[source]

Atmospheric correction.

NOTE: all subsystems (containing all spatial samplings) belonging to the same scene ID are needed

Parameters

L1B_objs (Iterable[L1B_object]) – list containing one or multiple L1B objects belonging to the same scene ID.

Return type

List[L1C_object]
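Because all subsystems of a scene have to be passed together, the caller needs to group its objects by scene ID first. A minimal sketch of that grouping follows; plain dicts stand in for L1B objects, and the names `group_by_scene`, `scene_ID` and `subsystem` are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_scene(objects, key="scene_ID"):
    """Group records by scene ID, keeping the input order within each group."""
    grouped = defaultdict(list)
    for obj in objects:
        grouped[obj[key]].append(obj)
    return dict(grouped)


# Hypothetical records: scene 1 consists of two subsystems, scene 2 of one.
l1b_records = [
    {"scene_ID": 1, "subsystem": "subsystem_A"},
    {"scene_ID": 2, "subsystem": ""},
    {"scene_ID": 1, "subsystem": "subsystem_B"},
]
groups = group_by_scene(l1b_records)
```

Each value of `groups` is then a list that could be handed to a mapper expecting all subsystems of one scene.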

gms_preprocessing.processing.pipeline.L2A_map(L1C_objs, block_size=None, return_tiles=True)[source]

Geometric homogenization.

Corrects geometric displacements, resamples to the target grid of the use case, and merges multiple GMS objects belonging to the same scene ID (subsystems) into a single L2A object. NOTE: The output L2A_object must be cut into tiles because the size of L2A_obj in memory exceeds the maximum serialization size.

Parameters
  • L1C_objs (Union[List[L1C_object], Tuple[L1C_object]]) – list containing one or multiple L1C objects belonging to the same scene ID.

  • block_size (Optional[tuple]) – X/Y size of output tiles in pixels, e.g. (1024,1024)

  • return_tiles (bool) – return computed L2A object in tiles

Return type

Union[List[L2A_object], L2A_object]

Returns

list of L2A_object tiles
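To illustrate what cutting an object into tiles of a given block_size means, the sketch below computes tile boundaries for a raster of a given shape. It assumes that block_size is ordered as (X, Y), i.e. (columns, rows), and that tiles at the edges are simply smaller; `tile_bounds` is a hypothetical helper, not part of the package.

```python
def tile_bounds(n_rows, n_cols, block_size=(1024, 1024)):
    """Yield (row_start, row_end, col_start, col_end) with exclusive end indices."""
    block_cols, block_rows = block_size  # assumed order: X size, then Y size
    for row_start in range(0, n_rows, block_rows):
        for col_start in range(0, n_cols, block_cols):
            yield (
                row_start,
                min(row_start + block_rows, n_rows),  # clip at the lower edge
                col_start,
                min(col_start + block_cols, n_cols),  # clip at the right edge
            )


# A 2500 x 2000 pixel raster (rows x columns) cut into 1024 x 1024 blocks.
tiles = list(tile_bounds(2500, 2000, block_size=(1024, 1024)))
```

For this input the sketch yields six tiles (three tile rows by two tile columns), the last one covering rows 2048 to 2500 and columns 1024 to 2000.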

gms_preprocessing.processing.pipeline.L2B_map(L2A_obj)[source]

Return type

L2B_object

gms_preprocessing.processing.pipeline.L2C_map(L2B_obj)[source]

Return type

L2C_object

gms_preprocessing.processing.pipeline.run_complete_preprocessing(list_dataset_dicts_per_scene)[source]

NOTE: Exceptions raised in this function must be caught by the calling function (process controller).
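One way a caller might honour this contract is to wrap each per-scene call and record failures instead of letting them abort the whole job. The sketch below is an assumption about such a wrapper, not the process controller's actual code; `run_scene_safely` and `toy_worker` are hypothetical.

```python
def run_scene_safely(worker, scene_datasets, failed):
    """Call worker for one scene; record the failure instead of propagating it."""
    try:
        return worker(scene_datasets)
    except Exception as exc:
        # Keep enough context to report the failed scene later.
        failed.append((scene_datasets, repr(exc)))
        return None


def toy_worker(scene_datasets):
    """Toy stand-in for a per-scene pipeline that may raise."""
    if not scene_datasets:
        raise ValueError("no datasets for scene")
    return len(scene_datasets)


failed = []
scenes = [[{"scene_ID": 1}], []]
results = [run_scene_safely(toy_worker, scene, failed) for scene in scenes]
```

After this runs, `results` holds one entry per scene (with `None` for the failed one) and `failed` lists the scenes whose processing raised.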

Parameters

list_dataset_dicts_per_scene (List[dict]) –

Return type

Union[GMS_object, List, Generator]

Returns

gms_preprocessing.processing.process_controller module

class gms_preprocessing.processing.process_controller.ProcessController(job_ID, **config_kwargs)[source]

Bases: object

gms_preprocessing process controller

Parameters
  • job_ID – job ID belonging to a valid database record within table ‘jobs’

  • config_kwargs – keyword arguments to be passed to gms_preprocessing.set_config()

property DB_job_record

L1A_processing()[source]

Run Level 1A processing: Data import and metadata homogenization

L1B_processing()[source]

Run Level 1B processing: calculation of geometric shifts

L1C_processing()[source]

Run Level 1C processing: atmospheric correction

L2A_processing()[source]

Run Level 2A processing: geometric homogenization

L2B_processing()[source]

Run Level 2B processing: spectral homogenization

L2C_processing()[source]

Run Level 2C processing: accuracy assessment and MGRS tiling

add_local_availability(datasets)[source]

Check availability of all subsets per scene and processing level.

NOTE: The processing level is reset for scenes whose subsystems are not all available at the same processing level.

Parameters

datasets (List[OrderedDict]) – List of one OrderedDict per subsystem as generated by CFG.data_list

Return type

List[OrderedDict]
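The consistency rule in the note can be sketched as follows. The key names `scene_ID` and `proc_level`, the helper name `reset_inconsistent_levels`, and the choice of `None` as the reset value are all assumptions for illustration; the source does not state them.

```python
from collections import OrderedDict, defaultdict


def reset_inconsistent_levels(datasets):
    """Reset the processing level of scenes whose subsystems disagree."""
    levels_per_scene = defaultdict(set)
    for ds in datasets:
        levels_per_scene[ds["scene_ID"]].add(ds["proc_level"])

    for ds in datasets:
        if len(levels_per_scene[ds["scene_ID"]]) > 1:
            # Subsystems of this scene are at different levels (assumed reset value: None).
            ds["proc_level"] = None
    return datasets


datasets = [
    OrderedDict(scene_ID=1, proc_level="L1A"),
    OrderedDict(scene_ID=1, proc_level="L1B"),
    OrderedDict(scene_ID=2, proc_level="L1A"),
]
checked = reset_inconsistent_levels(datasets)
```

Here scene 1 is reset because its two subsystems are at different levels, while scene 2 keeps its level.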

benchmark()[source]

Run a benchmark.

clear_lists_procObj()[source]

create_job_summary()[source]

Create job success summary

get_DB_objects(procLvl, prevLvl_objects=None, parallLev=None, blocksize=None)[source]

Returns a list of GMS objects for datasets available on disk that have to be processed by the current processor.

Parameters
  • procLvl – <str> processing level of the current processor

  • prevLvl_objects – <list> of in-mem GMS objects produced by the previous processor

  • parallLev – <str> parallelization level (‘scenes’ or ‘tiles’); defines whether full cubes or blocks are returned

  • blocksize – <tuple> block size in case blocks are to be returned, e.g. (2000,2000)

Returns

property logger

run_all_processors(custom_data_list=None, serialize_after_each_mapper=False)[source]

Run all processors at once.
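The per-level methods listed above suggest a fixed order from L1A to L2C. The following sketch shows that chaining with a recording stand-in; it assumes the levels are run strictly in this order, which the source does not state explicitly, and `RecordingController` and `run_levels_in_order` are hypothetical names.

```python
LEVELS = ["L1A", "L1B", "L1C", "L2A", "L2B", "L2C"]


class RecordingController:
    """Toy controller that only records which level methods were called."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only invoked for attributes that are not found normally.
        if name.endswith("_processing"):
            return lambda: self.calls.append(name)
        raise AttributeError(name)


def run_levels_in_order(controller, levels=LEVELS):
    """Call <level>_processing() on the controller for each level in turn."""
    for level in levels:
        getattr(controller, level + "_processing")()


controller = RecordingController()
run_levels_in_order(controller)
```

After the call, `controller.calls` lists the six method names from `L1A_processing` to `L2C_processing` in order.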

property sceneids_failed

shutdown()[source]

Shut down the process controller instance (loggers, removal of temporary directories, …).

stop(signum, frame)[source]

Interrupt the running process controller gracefully.
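The (signum, frame) signature matches what Python's signal module passes to a handler, so a method like this can be registered directly. The sketch below shows that pattern with a hypothetical `ControllerSketch`; which signals the real controller handles, and what it does on interruption, are assumptions here.

```python
import signal


class ControllerSketch:
    """Toy controller that stops between work items once a signal arrived."""

    def __init__(self):
        self.stop_requested = False

    def stop(self, signum, frame):
        """Signal handler: record the request instead of exiting immediately."""
        self.stop_requested = True

    def run(self, work_items):
        done = []
        for item in work_items:
            if self.stop_requested:
                break  # leave the loop cleanly so that cleanup can follow
            done.append(item)
        return done


def install_handlers(controller):
    """Register controller.stop for SIGINT and SIGTERM (assumed choice of signals)."""
    # signal.signal() may only be called from the main thread.
    signal.signal(signal.SIGINT, controller.stop)
    signal.signal(signal.SIGTERM, controller.stop)
```

Setting a flag and checking it between work items lets the running step finish before the controller shuts down.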

update_DB_job_record()[source]

Update the database records of the current job (table ‘jobs’).

update_DB_job_statistics(usecase_datalist)[source]

Update job statistics of the running job in the database.

gms_preprocessing.processing.process_controller.get_job_summary(list_GMS_objects)[source]

Module contents