gms_preprocessing package

Subpackages

Submodules

gms_preprocessing.version module

Module contents

class gms_preprocessing.ProcessController(job_ID, **config_kwargs)[source]

Bases: object

gms_preprocessing process controller

Parameters
  • job_ID – job ID belonging to a valid database record within table ‘jobs’

  • config_kwargs – keyword arguments to be passed to gms_preprocessing.set_config()

property DB_job_record
L1A_processing()[source]

Run Level 1A processing: Data import and metadata homogenization

L1B_processing()[source]

Run Level 1B processing: calculation of geometric shifts

L1C_processing()[source]

Run Level 1C processing: atmospheric correction

L2A_processing()[source]

Run Level 2A processing: geometric homogenization

L2B_processing()[source]

Run Level 2B processing: spectral homogenization

L2C_processing()[source]

Run Level 2C processing: accuracy assessment and MGRS tiling
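The six level methods above share a naming pattern, `<level>_processing()`. A minimal sketch of calling them in pipeline order follows. `run_levels` and `StubController` are hypothetical names introduced only for illustration, so that the snippet runs without a database or the package installed. This page does not say whether the real controller expects the levels to be called one by one like this instead of through `run_all_processors()`.

```python
# Processing levels in pipeline order, taken from the method names above.
LEVELS = ["L1A", "L1B", "L1C", "L2A", "L2B", "L2C"]


def run_levels(controller, levels=LEVELS):
    """Call <level>_processing() on the controller for each level, in order.

    Hypothetical helper: works with any object exposing these method names.
    """
    completed = []
    for level in levels:
        method = getattr(controller, f"{level}_processing")
        method()
        completed.append(level)
    return completed


class StubController:
    """Hypothetical stand-in that only records which methods were called."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that are not found the normal way.
        if name.endswith("_processing"):
            return lambda: self.calls.append(name)
        raise AttributeError(name)


stub = StubController()
run_levels(stub)
print(stub.calls)
```

Passing the controller in as an argument keeps the ordering logic testable independently of a real job record.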

add_local_availability(datasets)[source]

Check availability of all subsets per scene and processing level.

NOTE: The processing level is reset for those scenes whose subsystems are not all available at the same processing level.

Parameters

datasets (List[OrderedDict]) – List of one OrderedDict per subsystem as generated by CFG.data_list

Return type

List[OrderedDict]
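The reset rule in the note can be illustrated with a small, self-contained sketch. It is not the package's implementation: the dictionary keys `scene_ID`, `subsystem` and `proc_level`, the helper name `reset_inconsistent_levels`, and the use of `None` as the reset value are all assumptions made for this example.

```python
from collections import OrderedDict, defaultdict


def reset_inconsistent_levels(datasets, reset_value=None):
    """Reset 'proc_level' for scenes whose subsystems disagree on it.

    Assumes each dataset dict has 'scene_ID' and 'proc_level' keys
    (hypothetical key names, chosen for illustration).
    """
    # Collect the set of processing levels seen for each scene.
    levels_by_scene = defaultdict(set)
    for ds in datasets:
        levels_by_scene[ds["scene_ID"]].add(ds["proc_level"])

    # A scene with more than one distinct level is inconsistent.
    for ds in datasets:
        if len(levels_by_scene[ds["scene_ID"]]) > 1:
            ds["proc_level"] = reset_value
    return datasets


datasets = [
    OrderedDict(scene_ID=1, subsystem="VNIR", proc_level="L1A"),
    OrderedDict(scene_ID=1, subsystem="SWIR", proc_level="L1B"),
    OrderedDict(scene_ID=2, subsystem="", proc_level="L1C"),
]
result = reset_inconsistent_levels(datasets)
print([ds["proc_level"] for ds in result])
```

Here scene 1 has two subsystems at different levels, so both entries are reset, while scene 2 keeps its level.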

benchmark()[source]

Run a benchmark.

clear_lists_procObj()[source]
create_job_summary()[source]

Create job success summary

get_DB_objects(procLvl, prevLvl_objects=None, parallLev=None, blocksize=None)[source]

Returns a list of GMS objects for datasets available on disk that have to be processed by the current processor.

Parameters
  • procLvl – <str> processing level of the current processor

  • prevLvl_objects – <list> of in-mem GMS objects produced by the previous processor

  • parallLev – <str> parallelization level (‘scenes’ or ‘tiles’) -> defines if full cubes or blocks are to be returned

  • blocksize – <tuple> block size in case blocks are to be returned, e.g. (2000,2000)
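To make the `blocksize` parameter concrete, the sketch below shows one way a full cube could be divided into blocks of at most (2000, 2000) pixels. `iter_blocks` is a hypothetical helper, not part of gms_preprocessing, and treating the tuple as (rows, columns) is an assumption. The axis order the package actually uses is not stated here.

```python
def iter_blocks(shape, blocksize=(2000, 2000)):
    """Yield (row_start, row_end, col_start, col_end) bounds covering `shape`.

    End indices are exclusive. Blocks at the lower and right edges are
    clipped, so they may be smaller than `blocksize`.
    """
    rows, cols = shape
    block_rows, block_cols = blocksize
    for r in range(0, rows, block_rows):
        for c in range(0, cols, block_cols):
            yield (r, min(r + block_rows, rows), c, min(c + block_cols, cols))


# A 5000 x 3000 array needs 3 blocks along the rows and 2 along the columns.
blocks = list(iter_blocks((5000, 3000)))
print(len(blocks))
print(blocks[-1])
```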

Returns

property logger
run_all_processors(custom_data_list=None, serialize_after_each_mapper=False)[source]

Run all processors at once.

property sceneids_failed
shutdown()[source]

Shut down the process controller instance (loggers, remove temporary directories, …).

stop(signum, frame)[source]

Interrupt the running process controller gracefully.

update_DB_job_record()[source]

Update the database records of the current job (table ‘jobs’).

update_DB_job_statistics(usecase_datalist)[source]

Update job statistics of the running job in the database.

gms_preprocessing.set_config(job_ID, json_config='', reset_status=False, **kwargs)[source]

Set up a configuration for a new gms_preprocessing job!

Parameters
  • job_ID – job ID of the job to be executed, e.g. 123456 (must be present in database)

  • json_config – path to JSON file containing configuration parameters or a string in JSON format

  • reset_status – whether to reset the job status or not (default=False)

  • kwargs

    keyword arguments to be passed to JobConfig.

    NOTE: All keyword arguments given here WILL OVERRIDE configurations that have been previously set via WebUI or via the json_config parameter!
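The override rule for `kwargs` can be sketched as a plain dictionary merge. This only illustrates the precedence, not how JobConfig is implemented: `merge_config` is a hypothetical helper, it accepts only a JSON string (not a file path), and the flat JSON layout is an assumption.

```python
import json


def merge_config(json_config="", **kwargs):
    """Return options in which keyword arguments override JSON values.

    Hypothetical helper: handles a JSON string only and assumes a flat
    mapping of option names to values.
    """
    options = json.loads(json_config) if json_config else {}
    options.update(kwargs)  # keyword arguments take precedence
    return options


merged = merge_config('{"CPUs": 4, "log_level": "INFO"}', log_level="DEBUG")
print(merged)
```

In this example `CPUs` is taken from the JSON string, while `log_level` is replaced by the keyword argument.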

Keyword Arguments
  • inmem_serialization: False: write intermediate results to disk in order to save memory

    True: keep intermediate results in memory in order to save IO time

  • parallelization_level: <str> choices: ‘scenes’ - parallelization on scene-level

    ‘tiles’ - parallelization on tile-level

  • db_host: host name of the server that runs the postgreSQL database

  • spatial_index_server_host: host name of the server that runs the SpatialIndexMediator

  • spatial_index_server_port: port used for connecting to SpatialIndexMediator

  • delete_old_output: <bool> whether to delete previously created output of the given job ID

    before running the job (default = False)

  • exec_L1AP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L1BP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L1CP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L2AP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L2BP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • exec_L2CP: list of 3 elements: [run processor, write output, delete output if not needed anymore]

  • CPUs: number of CPU cores to be used for processing (default: None -> use all available)

  • allow_subMultiprocessing:

    allow multiprocessing within workers

  • disable_exception_handler:

    enable/disable automatic handling of unexpected exceptions (default: True -> enabled)

  • log_level: the logging level to be used (choices: ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’;

    default: ‘INFO’)

  • tiling_block_size_XY:

    X/Y block size to be used for any tiling process (default: (2048,2048))

  • is_test: whether the current job represents a software test job (run by a test runner) or not

    (default=False)

  • profiling: enable/disable code profiling (default: False)

  • benchmark_global:

    enable/disable benchmark of the whole processing pipeline

  • path_procdata_scenes:

    output path to store processed scenes

  • path_procdata_MGRS:

    output path to store processed MGRS tiles

  • path_archive: input path where downloaded data are stored

  • virtual_sensor_id: 1: Landsat-8, 10: Sentinel-2A 10m

  • datasetid_spatial_ref: 249: Sentinel-2A

Return type

JobConfig
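Several of the keyword arguments listed above have a fixed set of choices or a fixed shape. The sketch below checks a keyword dictionary against those documented constraints before it would be handed to `set_config()`. `check_config_kwargs` is a hypothetical helper, not part of the package, and the boolean elements used in the `exec_*` lists are an assumption. The page only states that each list has three elements.

```python
LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
PARALLELIZATION_LEVELS = {"scenes", "tiles"}
EXEC_KEYS = [f"exec_{lvl}P" for lvl in ("L1A", "L1B", "L1C", "L2A", "L2B", "L2C")]


def check_config_kwargs(**kwargs):
    """Return a list of problems found in the given keyword arguments.

    Hypothetical pre-check based only on the choices documented above.
    """
    problems = []

    level = kwargs.get("parallelization_level")
    if level is not None and level not in PARALLELIZATION_LEVELS:
        problems.append(f"parallelization_level: unknown choice {level!r}")

    log_level = kwargs.get("log_level")
    if log_level is not None and log_level not in LOG_LEVELS:
        problems.append(f"log_level: unknown choice {log_level!r}")

    # Each exec_* option is documented as a list of 3 elements.
    for key in EXEC_KEYS:
        if key in kwargs and len(kwargs[key]) != 3:
            problems.append(f"{key}: expected 3 elements, got {len(kwargs[key])}")

    block = kwargs.get("tiling_block_size_XY")
    if block is not None and len(block) != 2:
        problems.append("tiling_block_size_XY: expected an (X, Y) pair")

    return problems


good = dict(
    parallelization_level="scenes",
    log_level="INFO",
    exec_L1AP=[True, True, False],  # element types assumed to be booleans
    tiling_block_size_XY=(2048, 2048),
)
bad = dict(log_level="VERBOSE", exec_L2CP=[True, False])

print(check_config_kwargs(**good))
print(check_config_kwargs(**bad))
```

A check like this reports all problems at once instead of failing on the first one, which can be convenient when a job configuration is assembled from several sources.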