gms_preprocessing package¶

Subpackages¶

Submodules¶

gms_preprocessing.version module¶

Module contents¶

class gms_preprocessing.ProcessController(job_ID, **config_kwargs)[source]¶

Bases: object

gms_preprocessing process controller

Parameters

job_ID – job ID belonging to a valid database record within table ‘jobs’
config_kwargs – keyword arguments to be passed to gms_preprocessing.set_config()
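
A minimal usage sketch, assuming the package is installed and the given job ID already exists in the 'jobs' table. The wrapper name `run_job` is hypothetical; `ProcessController` and `run_all_processors` are the names documented on this page.

```python
def run_job(job_id, **config_kwargs):
    """Create a process controller for an existing job and run every processor.

    Hypothetical helper: the import is deferred so that defining this function
    does not require gms_preprocessing or its database to be available.
    """
    from gms_preprocessing import ProcessController

    # config_kwargs are forwarded to gms_preprocessing.set_config()
    controller = ProcessController(job_id, **config_kwargs)
    controller.run_all_processors()
    return controller
```

A call might look like `run_job(123456, parallelization_level='scenes')`, where 123456 is the example job ID used elsewhere on this page.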

property DB_job_record¶

L1A_processing()[source]¶: Run Level 1A processing: Data import and metadata homogenization

L1B_processing()[source]¶: Run Level 1B processing: calculation of geometric shifts

L1C_processing()[source]¶: Run Level 1C processing: atmospheric correction

L2A_processing()[source]¶: Run Level 2A processing: geometric homogenization

L2B_processing()[source]¶: Run Level 2B processing: spectral homogenization

L2C_processing()[source]¶: Run Level 2C processing: accuracy assessment and MGRS tiling
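
The six level methods can be driven in sequence, as the following sketch illustrates. The strictly sequential order L1A to L2C is an assumption inferred from the level names, and `run_levels` is a hypothetical helper, not part of the package; whether the real controller is meant to be driven this way rather than through `run_all_processors` is not stated here.

```python
# Level names and descriptions as listed in this section; the sequential
# order is an assumption inferred from the naming.
PROCESSING_LEVELS = [
    ("L1A", "data import and metadata homogenization"),
    ("L1B", "calculation of geometric shifts"),
    ("L1C", "atmospheric correction"),
    ("L2A", "geometric homogenization"),
    ("L2B", "spectral homogenization"),
    ("L2C", "accuracy assessment and MGRS tiling"),
]


def run_levels(controller, first="L1A", last="L2C"):
    """Call controller.<level>_processing() for each level from first to last."""
    names = [name for name, _ in PROCESSING_LEVELS]
    start, stop = names.index(first), names.index(last)
    executed = []
    for name in names[start:stop + 1]:
        # look up e.g. controller.L1A_processing and call it
        getattr(controller, f"{name}_processing")()
        executed.append(name)
    return executed
```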

add_local_availability(datasets)[source]¶

Check availability of all subsets per scene and processing level.

NOTE: The processing level is reset for those scenes where not all subsystems are available in the same processing level.

Parameters: datasets (List[OrderedDict]) – List of one OrderedDict per subsystem as generated by CFG.data_list
Return type: List[OrderedDict]
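
The reset rule in the note can be sketched as follows. This is an illustration only: the keys `scene_ID`, `subsystem` and `proc_level` are hypothetical field names, and setting the level to `None` is an assumption about what "reset" means, since the page does not say.

```python
from collections import OrderedDict, defaultdict


def reset_inconsistent_levels(datasets):
    """Reset 'proc_level' to None for scenes whose subsystems disagree.

    The keys 'scene_ID' and 'proc_level' are hypothetical field names chosen
    for this sketch; the real records come from CFG.data_list.
    """
    levels_per_scene = defaultdict(set)
    for ds in datasets:
        levels_per_scene[ds["scene_ID"]].add(ds["proc_level"])

    for ds in datasets:
        # more than one distinct level means the subsystems are out of sync
        if len(levels_per_scene[ds["scene_ID"]]) > 1:
            ds["proc_level"] = None
    return datasets


datasets = [
    OrderedDict(scene_ID=1, subsystem="VNIR", proc_level="L1B"),
    OrderedDict(scene_ID=1, subsystem="SWIR", proc_level="L1A"),
    OrderedDict(scene_ID=2, subsystem="", proc_level="L1C"),
]
reset_inconsistent_levels(datasets)
```

In this example both subsystems of scene 1 are reset because they sit at different levels, while scene 2 keeps its level.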

benchmark()[source]¶: Run a benchmark.

clear_lists_procObj()[source]¶

create_job_summary()[source]¶: Create job success summary

get_DB_objects(procLvl, prevLvl_objects=None, parallLev=None, blocksize=None)[source]¶

Returns a list of GMS objects for datasets available on disk that have to be processed by the current processor.

Parameters

procLvl – <str> processing level of the current processor
prevLvl_objects – <list> of in-mem GMS objects produced by the previous processor
parallLev – <str> parallelization level (‘scenes’ or ‘tiles’) -> defines if full cubes or blocks are to be returned
blocksize – <tuple> block size in case blocks are to be returned, e.g. (2000,2000)
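
To illustrate what returning blocks instead of full cubes could look like, the sketch below splits an array shape into block bounds. The helper `iter_block_bounds` is hypothetical, and the (rows, columns) ordering of `blocksize` is an assumption; the page only gives the example value (2000,2000).

```python
def iter_block_bounds(shape, blocksize=(2000, 2000)):
    """Yield (row_start, row_end, col_start, col_end) for each block.

    End indices are exclusive; edge blocks are clipped to the array shape.
    """
    rows, cols = shape
    block_rows, block_cols = blocksize
    for r0 in range(0, rows, block_rows):
        for c0 in range(0, cols, block_cols):
            yield r0, min(r0 + block_rows, rows), c0, min(c0 + block_cols, cols)


# A 5000 x 3000 scene yields 3 x 2 blocks, the last ones smaller than 2000.
bounds = list(iter_block_bounds((5000, 3000)))
```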

Returns

property logger¶

run_all_processors(custom_data_list=None, serialize_after_each_mapper=False)[source]¶: Run all processors at once.

property sceneids_failed¶

shutdown()[source]¶: Shut down the process controller instance (loggers, remove temporary directories, …).

stop(signum, frame)[source]¶: Interrupt the running process controller gracefully.
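
The `(signum, frame)` signature matches what Python passes to handlers registered with `signal.signal`. The sketch below shows that general pattern with a hypothetical `StoppableWorker` class. Which signals the real controller listens for, and what it does on interruption, is not documented here.

```python
import signal


class StoppableWorker:
    """Hypothetical stand-in for a long-running controller."""

    def __init__(self):
        self.stop_requested = False

    def stop(self, signum, frame):
        # Same (signum, frame) signature that Python passes to signal handlers.
        self.stop_requested = True

    def install(self):
        # Must be called from the main thread of the main interpreter.
        signal.signal(signal.SIGINT, self.stop)
        signal.signal(signal.SIGTERM, self.stop)

    def run(self, work_items):
        done = []
        for item in work_items:
            if self.stop_requested:
                break  # finish cleanly instead of dying mid-item
            done.append(item)
        return done
```

Setting a flag and checking it between work items lets the loop exit at a safe point, which is one common way to make an interruption graceful.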

update_DB_job_record()[source]¶: Update the database records of the current job (table ‘jobs’).

update_DB_job_statistics(usecase_datalist)[source]¶: Update job statistics of the running job in the database.

gms_preprocessing.set_config(job_ID, json_config='', reset_status=False, **kwargs)[source]¶

Set up a configuration for a new gms_preprocessing job!

Parameters

job_ID – job ID of the job to be executed, e.g. 123456 (must be present in database)
json_config – path to JSON file containing configuration parameters or a string in JSON format
reset_status – whether to reset the job status or not (default=False)
kwargs – keyword arguments to be passed to JobConfig. NOTE: All keyword arguments given here WILL OVERRIDE configurations that have been previously set via WebUI or via the json_config parameter!
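
The two behaviours described above (json_config accepting either a file path or a JSON string, and keyword arguments taking precedence) can be sketched as follows. Both helper functions are hypothetical, and the flat JSON layout is an assumption; the actual structure of the configuration file is not shown on this page.

```python
import json
import os


def load_json_config(json_config=""):
    """Return a dict from either a path to a JSON file or a JSON string."""
    if not json_config:
        return {}
    if os.path.isfile(json_config):
        with open(json_config) as fh:
            return json.load(fh)
    return json.loads(json_config)


def resolve_options(json_config="", **kwargs):
    """Merge options so that explicit keyword arguments win."""
    options = load_json_config(json_config)
    options.update(kwargs)  # kwargs override anything set earlier
    return options


# log_level from the JSON string is replaced by the keyword argument.
options = resolve_options('{"CPUs": 4, "log_level": "INFO"}', log_level="DEBUG")
```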

Keyword Arguments

inmem_serialization: False: write intermediate results to disk in order to save memory; True: keep intermediate results in memory in order to save IO time
parallelization_level: <str> choices: ‘scenes’ - parallelization on scene-level; ‘tiles’ - parallelization on tile-level
db_host: host name of the server that runs the PostgreSQL database
spatial_index_server_host: host name of the server that runs the SpatialIndexMediator
spatial_index_server_port: port used for connecting to SpatialIndexMediator
delete_old_output: <bool> whether to delete previously created output of the given job ID before running the job (default = False)
exec_L1AP: list of 3 elements: [run processor, write output, delete output if not needed anymore]
exec_L1BP: list of 3 elements: [run processor, write output, delete output if not needed anymore]
exec_L1CP: list of 3 elements: [run processor, write output, delete output if not needed anymore]
exec_L2AP: list of 3 elements: [run processor, write output, delete output if not needed anymore]
exec_L2BP: list of 3 elements: [run processor, write output, delete output if not needed anymore]
exec_L2CP: list of 3 elements: [run processor, write output, delete output if not needed anymore]
CPUs: number of CPU cores to be used for processing (default: None -> use all available)
allow_subMultiprocessing: allow multiprocessing within workers
disable_exception_handler: enable/disable automatic handling of unexpected exceptions (default: True -> enabled)
log_level: the logging level to be used (choices: ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’; default: ‘INFO’)
tiling_block_size_XY: X/Y block size to be used for any tiling process (default: (2048,2048))
is_test: whether the current job represents a software test job (run by a test runner) or not (default=False)
profiling: enable/disable code profiling (default: False)
benchmark_global: enable/disable benchmark of the whole processing pipeline
path_procdata_scenes: output path to store processed scenes
path_procdata_MGRS: output path to store processed MGRS tiles
path_archive: input path where downloaded data are stored
virtual_sensor_id: 1: Landsat-8, 10: Sentinel-2A 10m
datasetid_spatial_ref: 249: Sentinel-2A
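
As a sketch of how a caller might assemble these keyword arguments, the hypothetical helper below checks a few values against the choices listed above before they are handed on. The boolean type of the three exec_* elements is an assumption; the page only states that each list has three elements.

```python
LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
PARALLELIZATION_LEVELS = ("scenes", "tiles")
EXEC_KEYS = tuple(f"exec_{lvl}P" for lvl in ("L1A", "L1B", "L1C", "L2A", "L2B", "L2C"))


def check_job_kwargs(**kwargs):
    """Raise ValueError for values that contradict the documented choices."""
    if kwargs.get("log_level", "INFO") not in LOG_LEVELS:
        raise ValueError(f"log_level must be one of {LOG_LEVELS}")
    level = kwargs.get("parallelization_level")
    if level is not None and level not in PARALLELIZATION_LEVELS:
        raise ValueError(f"parallelization_level must be one of {PARALLELIZATION_LEVELS}")
    for key in EXEC_KEYS:
        # [run processor, write output, delete output if not needed anymore]
        if key in kwargs and len(kwargs[key]) != 3:
            raise ValueError(f"{key} must have 3 elements")
    return kwargs


job_kwargs = check_job_kwargs(
    parallelization_level="scenes",
    log_level="DEBUG",
    exec_L1AP=[True, True, False],  # assumed booleans: run, write, keep output
    tiling_block_size_XY=(2048, 2048),
)
```

The resulting dictionary could then be unpacked into a call such as `set_config(job_ID, **job_kwargs)`, keeping in mind that these values override anything set via WebUI or json_config.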

Return type

JobConfig