gms_preprocessing package



gms_preprocessing.version module

Module contents

class gms_preprocessing.ProcessController(job_ID, **config_kwargs)[source]

Bases: object

gms_preprocessing process controller

  • job_ID – job ID belonging to a valid database record within table ‘jobs’

  • config_kwargs – keyword arguments to be passed to gms_preprocessing.set_config()

property DB_job_record

Run Level 1A processing: Data import and metadata homogenization


Run Level 1B processing: calculation of geometric shifts


Run Level 1C processing: atmospheric correction


Run Level 2A processing: geometric homogenization


Run Level 2B processing: spectral homogenization


Run Level 2C processing: accuracy assessment and MGRS tiling


Check availability of all subsets per scene and processing level.

NOTE: The processing level is reset for those scenes where not all subsystems are available in the same processing level.


  • datasets (List[OrderedDict]) – list of one OrderedDict per subsystem as generated by CFG.data_list

Return type
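
The reset rule described in the note above can be illustrated with a small, self-contained sketch. It is not the package's implementation: the function name and the keys scene_ID, subsystem and proc_level are assumptions made for illustration, and "reset" is assumed here to mean setting the level to None.

```python
from collections import OrderedDict, defaultdict


def reset_inconsistent_levels(datasets, scene_key="scene_ID", level_key="proc_level"):
    """Hypothetical helper: reset the level of scenes whose subsystems disagree."""
    # Collect the set of processing levels found for each scene.
    levels_by_scene = defaultdict(set)
    for ds in datasets:
        levels_by_scene[ds[scene_key]].add(ds[level_key])

    # A scene with more than one distinct level is inconsistent: reset all its subsystems.
    for ds in datasets:
        if len(levels_by_scene[ds[scene_key]]) > 1:
            ds[level_key] = None  # assumption: "reset" means "no level yet"
    return datasets


datasets = [
    OrderedDict(scene_ID=1, subsystem="VNIR", proc_level="L1B"),
    OrderedDict(scene_ID=1, subsystem="SWIR", proc_level="L1A"),
    OrderedDict(scene_ID=2, subsystem="", proc_level="L1A"),
]
reset_inconsistent_levels(datasets)
print([ds["proc_level"] for ds in datasets])  # [None, None, 'L1A']
```

The sketch only compares the levels of the subsystems that are present in the list; detecting a subsystem that is missing altogether would need additional knowledge about the sensor.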



Run a benchmark.


Create job success summary

get_DB_objects(procLvl, prevLvl_objects=None, parallLev=None, blocksize=None)[source]

Returns a list of GMS objects for datasets available on disk that have to be processed by the current processor.

  • procLvl – <str> processing level of the current processor

  • prevLvl_objects – <list> of in-mem GMS objects produced by the previous processor

  • parallLev – <str> parallelization level (‘scenes’ or ‘tiles’); determines whether full cubes or blocks are returned

  • blocksize – <tuple> block size in case blocks are to be returned, e.g. (2000,2000)
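
To make the blocksize parameter more concrete, the following sketch shows one way a scene could be split into blocks. It is an illustration only, not how gms_preprocessing tiles its data: the helper name is hypothetical, and the tuple is assumed to be ordered (rows, columns).

```python
def iter_block_bounds(shape, blocksize=(2000, 2000)):
    """Hypothetical helper: yield (row_start, row_end, col_start, col_end) per block."""
    rows, cols = shape
    block_rows, block_cols = blocksize  # assumption: (rows, columns) order
    for row_start in range(0, rows, block_rows):
        for col_start in range(0, cols, block_cols):
            # Blocks at the lower and right edges are clipped to the scene extent.
            yield (row_start, min(row_start + block_rows, rows),
                   col_start, min(col_start + block_cols, cols))


bounds = list(iter_block_bounds((5000, 3000)))
print(len(bounds))   # 6
print(bounds[-1])    # (4000, 5000, 2000, 3000)
```

End indices are exclusive, so each tuple can be used directly for array slicing.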


property logger

run_all_processors(custom_data_list=None, serialize_after_each_mapper=False)[source]

Run all processors at once.

property sceneids_failed

Shut down the process controller instance (loggers, remove temporary directories, …).

stop(signum, frame)[source]

Interrupt the running process controller gracefully.
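
The (signum, frame) signature matches what Python's signal module passes to a handler, so a method like this can be registered for SIGINT or SIGTERM. The sketch below uses a stand-in class rather than the real ProcessController; DemoController and install_handlers are hypothetical names, and whether the package registers the handler itself is not stated here.

```python
import signal
import threading


class DemoController:
    """Stand-in for a controller; NOT the gms_preprocessing class."""

    def __init__(self):
        self.stopped = False
        self.last_signal = None

    def stop(self, signum, frame):
        # Record the signal and flag the controller as stopped.
        self.last_signal = signum
        self.stopped = True


def install_handlers(controller):
    """Register controller.stop for SIGINT and SIGTERM (main thread only)."""
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, controller.stop)


controller = DemoController()

# signal.signal() may only be called from the main thread of the main interpreter.
if threading.current_thread() is threading.main_thread():
    previous = {sig: signal.getsignal(sig) for sig in (signal.SIGINT, signal.SIGTERM)}
    install_handlers(controller)
    # Restore the earlier handlers so this demo leaves no side effects behind.
    for sig, handler in previous.items():
        signal.signal(sig, handler)

# Simulate delivery of SIGINT by calling the handler directly.
controller.stop(signal.SIGINT, None)
print(controller.stopped)  # True
```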


Update the database records of the current job (table ‘jobs’).


Update job statistics of the running job in the database.

gms_preprocessing.set_config(job_ID, json_config='', reset_status=False, **kwargs)[source]

Set up a configuration for a new gms_preprocessing job.

  • job_ID – job ID of the job to be executed, e.g. 123456 (must be present in database)

  • json_config – path to JSON file containing configuration parameters or a string in JSON format

  • reset_status – whether to reset the job status or not (default=False)

  • kwargs – keyword arguments to be passed to JobConfig. NOTE: All keyword arguments given here WILL OVERRIDE configurations that have been previously set via WebUI or via the json_config parameter!
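
The precedence described for kwargs can be pictured as a dictionary merge in which explicit keyword arguments are applied last. The sketch below is not the JobConfig implementation: resolve_config is a hypothetical helper, the configuration is assumed to be a flat JSON object, and settings coming from the WebUI are left out.

```python
import json
import os


def resolve_config(json_config="", **kwargs):
    """Hypothetical helper: merge JSON settings with overriding keyword arguments."""
    if not json_config:
        params = {}
    elif os.path.isfile(json_config):
        # json_config is a path to a JSON file.
        with open(json_config) as fh:
            params = json.load(fh)
    else:
        # Otherwise json_config is treated as a string in JSON format.
        params = json.loads(json_config)

    # Keyword arguments are applied last, so they override the JSON values.
    params.update(kwargs)
    return params


merged = resolve_config('{"CPUs": 4, "log_level": "INFO"}', log_level="DEBUG")
print(merged)  # {'CPUs': 4, 'log_level': 'DEBUG'}
```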

Keyword Arguments
  • inmem_serialization: False - write intermediate results to disk in order to save memory

    True - keep intermediate results in memory in order to save I/O time

  • parallelization_level: <str> choices: ‘scenes’ - parallelization on scene-level

    ‘tiles’ - parallelization on tile-level

  • db_host: host name of the server that runs the PostgreSQL database

  • spatial_index_server_host: host name of the server that runs the SpatialIndexMediator

  • spatial_index_server_port: port used for connecting to SpatialIndexMediator

  • delete_old_output: <bool> whether to delete previously created output of the given job ID

    before running the job (default = False)

  • exec_L1AP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L1BP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L1CP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L2AP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L2BP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L2CP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • CPUs: number of CPU cores to be used for processing (default: None -> use all available)

  • allow_subMultiprocessing:

    allow multiprocessing within workers

  • disable_exception_handler:

    enable/disable automatic handling of unexpected exceptions (default: True -> enabled)

  • log_level: the logging level to be used (choices: ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’;

    default: ‘INFO’)

  • tiling_block_size_XY:

    X/Y block size to be used for any tiling process (default: (2048,2048))

  • is_test: whether the current job represents a software test job (run by a test runner) or not


  • profiling: enable/disable code profiling (default: False)

  • benchmark_global:

    enable/disable benchmark of the whole processing pipeline

  • path_procdata_scenes:

    output path to store processed scenes

  • path_procdata_MGRS:

    output path to store processed MGRS tiles

  • path_archive: input path where downloaded data are stored

  • virtual_sensor_id: 1: Landsat-8, 10: Sentinel-2A 10m

  • datasetid_spatial_ref: 249: Sentinel-2A

Return type
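
As a sketch of how the keyword arguments listed above might be assembled, the example below checks a few documented constraints before handing the arguments to ProcessController. The names check_job_kwargs and run_job are hypothetical and not part of gms_preprocessing, and the exec_* entries are assumed to be boolean flags. run_job is only defined here, not called, because it needs an installed package and a valid job record in the database.

```python
EXEC_KEYS = ("exec_L1AP", "exec_L1BP", "exec_L1CP",
             "exec_L2AP", "exec_L2BP", "exec_L2CP")
LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def check_job_kwargs(kwargs):
    """Hypothetical pre-flight check; keys that are not given are skipped."""
    for key in EXEC_KEYS:
        # Documented as: [run processor, write output, delete output if not needed anymore]
        if key in kwargs and len(kwargs[key]) != 3:
            raise ValueError(f"{key} must be a list of 3 elements")
    if "parallelization_level" in kwargs and \
            kwargs["parallelization_level"] not in ("scenes", "tiles"):
        raise ValueError("parallelization_level must be 'scenes' or 'tiles'")
    if "log_level" in kwargs and kwargs["log_level"] not in LOG_LEVELS:
        raise ValueError(f"log_level must be one of {LOG_LEVELS}")
    return kwargs


def run_job(job_ID, **kwargs):
    """Run all processors for one job (requires gms_preprocessing and its database)."""
    from gms_preprocessing import ProcessController  # imported lazily on purpose

    controller = ProcessController(job_ID, **check_job_kwargs(kwargs))
    controller.run_all_processors()
    return controller


job_kwargs = check_job_kwargs(dict(
    parallelization_level="scenes",
    inmem_serialization=False,
    exec_L1AP=[True, True, True],       # assumption: flags given as booleans
    exec_L2CP=[False, False, False],
    CPUs=4,
    log_level="DEBUG",
    tiling_block_size_XY=(2048, 2048),
))
print(sorted(job_kwargs))
```

With a valid job ID, the call would then be run_job(123456, **job_kwargs), using the example ID from the parameter description.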