gms_preprocessing.misc package

Submodules

gms_preprocessing.misc.database_tools module

class gms_preprocessing.misc.database_tools.GMS_JOB(conn_db)[source]

Bases: object

gms_preprocessing job manager

Parameters

conn_db (str) – <str> the database connection parameters as given by CFG.conn_params

create()[source]

Add the job to the ‘jobs’ table of the database.

Return type

int

Returns

<int> the job ID of the newly created job

property db_entry

Returns an OrderedDict containing keys and values of the database entry.

delete_procdata_of_entire_job(proc_level='all', force=False)[source]

Deletes all scene data processed by the current job ID.

Parameters
  • proc_level – <str> delete only results that have the given processing level

  • force

delete_procdata_of_failed_sceneIDs(proc_level='all', force=False)[source]

Deletes all data where processing failed within the current job ID.

Parameters
  • proc_level – <str> delete only results that have the given processing level

  • force

from_dictlist(dictlist_data2process, virtual_sensor_id, datasetid_spatial_ref=249, comment=None)[source]

Parameters
  • dictlist_data2process (list) – <list> a list of dictionaries containing the keys ‘satellite’, ‘sensor’ and ‘filenames’, e.g. [{‘satellite’: ‘Landsat-8’, ‘sensor’: ‘OLI_TIRS’, ‘filenames’: file.tar.gz}, {…}]

  • virtual_sensor_id – <int> a valid ID from the ‘virtual_sensors’ table of the postgreSQL database

  • datasetid_spatial_ref (int) – <int> a valid dataset ID of the dataset to be chosen as spatial reference (from the ‘datasets’ table of the postgreSQL database) (default: 249 = Sentinel-2A; 104 = Landsat-8)

  • comment (Optional[str]) – <str> a comment describing the job (e.g. ‘Beta job’)

Return type

GMS_JOB

from_entityIDlist(list_entityids, virtual_sensor_id, datasetid_spatial_ref=249, comment=None)[source]

Create a GMS_JOB instance based on the given list of entity IDs.

Parameters
  • list_entityids

  • virtual_sensor_id

  • datasetid_spatial_ref

  • comment

from_filenames(list_filenames, virtual_sensor_id, datasetid_spatial_ref=249, comment=None)[source]

Create a GMS_JOB instance based on the given list of provider archive filenames.

Parameters
  • list_filenames

  • virtual_sensor_id

  • datasetid_spatial_ref

  • comment

from_job_ID(job_ID)[source]

Create a GMS_JOB instance by querying the database for a specific job ID.

Parameters

job_ID (int) – <int> a valid id from the database table ‘jobs’

Return type

GMS_JOB

from_sceneIDlist(list_sceneIDs, virtual_sensor_id, datasetid_spatial_ref=249, comment=None)[source]

Create a GMS_JOB instance based on the given list of scene IDs.

Parameters
  • list_sceneIDs (list) – <list> of scene IDs, e.g. [26781907, 26781917, 26542650, 26542451, 26541679]

  • virtual_sensor_id – <int> a valid ID from the ‘virtual_sensors’ table of the postgreSQL database

  • datasetid_spatial_ref (int) – <int> a valid dataset ID of the dataset to be chosen as spatial reference (from the ‘datasets’ table of the postgreSQL database) (default: 249 = Sentinel-2A; 104 = Landsat-8)

  • comment (Optional[str]) – <str> a comment describing the job (e.g. ‘Beta job’)

Return type

object

id

int

reset_job_progress()[source]

Resets everything in the database entry that has been written during the last run of the job.

update_db_entry()[source]

Updates all values of the current database entry belonging to the respective job ID. New values are taken from the attributes of the GMS_JOB instance.

property virtualsensorid
gms_preprocessing.misc.database_tools.add_externally_downloaded_data_to_GMSDB(conn_DB, src_folder, filenames, satellite, sensor)[source]

Adds externally downloaded satellite scenes to the GMS fileserver AND updates the corresponding postgreSQL records by adding a filename and setting the processing level to ‘DOWNLOADED’.

Parameters
  • conn_DB (str) – <str> pgSQL database connection parameters

  • src_folder (str) – <str> the source directory where externally provided archive files are saved

  • filenames (list) – <list> a list of filenames to be added to the GMS database

  • satellite (str) – <str> the name of the satellite to which the filenames belong

  • sensor (str) – <str> the name of the sensor to which the filenames belong

Return type

None

gms_preprocessing.misc.database_tools.add_missing_filenames_in_pgSQLdb(conn_params)[source]
gms_preprocessing.misc.database_tools.append_item_to_arrayCol_in_postgreSQLdb(conn_params, tablename, vals2append_dict, cond_dict=None, timeout=15000)[source]

Queries a postgreSQL database for the given parameters and appends the given value to the specified column of the query result.

Parameters
  • conn_params (str) – <str> connection parameters as provided by CFG.conn_params

  • tablename (str) – <str> name of the table within the database to be updated

  • vals2append_dict (dict) – <dict> a dictionary containing keys and value(s) to be set in the form {‘col_name’:[<value>,<value>]}

  • cond_dict (Optional[dict]) – <dict> a dictionary containing the query conditions in the form {‘column_name’:<value>} HINT: <value> can also be a list or a tuple of elements to match

  • timeout (int) – <int> allows to set a custom statement timeout (milliseconds)

Return type

Optional[str]

gms_preprocessing.misc.database_tools.archive_exists_on_fileserver(conn_DB, entityID)[source]

Queries the postgreSQL database for the archive filename of the given entity ID and checks if the corresponding archive file exists in the archive folder.

Parameters
  • conn_DB (str) – <str> pgSQL database connection parameters

  • entityID (str) – <str> entity ID to be checked

Return type

bool

gms_preprocessing.misc.database_tools.create_record_in_postgreSQLdb(conn_params, tablename, vals2write_dict, timeout=15000)[source]

Creates a single new record in a postgreSQL database and populates its columns with the given values.

Parameters
  • conn_params (str) – <str> connection parameters as provided by CFG.conn_params

  • tablename (str) – <str> name of the table within the database to be updated

  • vals2write_dict (dict) – <dict> a dictionary containing keys and values to be set in the form {‘col_name’:<value>}

  • timeout (int) – <int> allows to set a custom statement timeout (milliseconds)

Return type

Union[int, str]

gms_preprocessing.misc.database_tools.data_DB_updater(obj_dict)[source]

Updates the table “scenes_proc” or “mgrs_tiles_proc” within the postgreSQL database according to the given dictionary of a GMS object.

Parameters

obj_dict (dict) – <dict> a copy of the dictionary of the respective GMS object

Return type

None

gms_preprocessing.misc.database_tools.delete_processing_results(scene_ID, proc_level='all', force=False)[source]

Deletes the processing results of a given scene ID

Parameters
  • scene_ID – <int> the scene ID to delete results from

  • proc_level – <str> delete only results that have the given processing level

  • force – <bool> force deletion without user interaction

gms_preprocessing.misc.database_tools.delete_record_in_postgreSQLdb(conn_params, tablename, record_id, timeout=15000)[source]

Delete a single record in a postgreSQL database.

Parameters
  • conn_params (str) – <str> connection parameters as provided by CFG.conn_params

  • tablename (str) – <str> name of the table within the database to be updated

  • record_id (dict) – <dict> ID of the record to be deleted

  • timeout (int) – <int> allows to set a custom statement timeout (milliseconds)

Return type

Union[int, str]

gms_preprocessing.misc.database_tools.execute_pgSQL_query(cursor, query_command)[source]

Executes a postgreSQL query and catches the full error message if there is one.

gms_preprocessing.misc.database_tools.get_dict_satellite_name_id(conn_params)[source]

Returns a dictionary with satellite names as keys and satellite IDs as values as read from pgSQL database.

Parameters

conn_params (str) – <str> pgSQL database connection parameters

Return type

dict

gms_preprocessing.misc.database_tools.get_dict_sensor_name_id(conn_params)[source]

Returns a dictionary with sensor names as keys and sensor IDs as values as read from pgSQL database.

Parameters

conn_params (str) – <str> pgSQL database connection parameters

Return type

dict

gms_preprocessing.misc.database_tools.get_entityIDs_from_filename(conn_DB, filename)[source]

Returns entityID(s) for the given filename. In case of Sentinel-2 there can be multiple entity IDs if multiple granules are saved in one .zip file.

Parameters
  • conn_DB (str) – <str> pgSQL database connection parameters

  • filename (str) – <str> the filename to get the corresponding entity ID(s) for

Return type

list

gms_preprocessing.misc.database_tools.get_filename_by_entityID(conn_DB, entityid, satellite)[source]

Returns the filename for the given entity ID.

Parameters
  • conn_DB (str) – <str> pgSQL database connection parameters

  • entityid (str) – <str> entity ID

  • satellite (str) – <str> name of the satellite to which the entity ID belongs

Return type

str

gms_preprocessing.misc.database_tools.get_info_from_postgreSQLdb(conn_params, tablename, vals2return, cond_dict=None, records2fetch=0, timeout=15000)[source]

Queries a postgreSQL database for the given parameters.

Parameters
  • conn_params (str) – <str> connection parameters as provided by CFG.conn_params

  • tablename (str) – <str> name of the table within the database to be queried

  • vals2return (Union[list, str]) – <list or str> a list of strings containing the column titles of the values to be returned

  • cond_dict (Optional[dict]) – <dict> a dictionary containing the query conditions in the form {‘column_name’:<value>} HINT: <value> can also be a list or a tuple of elements to match, BUT note that the order of the list items is NOT respected!

  • records2fetch (int) – <int> number of records to be fetched (default=0: fetch unlimited records)

  • timeout (int) – <int> allows to set a custom statement timeout (milliseconds)

Return type

Union[list, str]

gms_preprocessing.misc.database_tools.get_notDownloadedsceneIDs(conn_DB, entityIDs, satellite, sensor, src_folder)[source]

Takes a list of entity IDs and extracts those records that have the corresponding archive file in the given source folder and that have the processing level ‘METADATA’ in the pgSQL database. Based on this subset a numpy array containing the corresponding scene IDs and the target filenames for the fileserver is returned.

Parameters
  • conn_DB (str) – <str> pgSQL database connection parameters

  • entityIDs (list) – <list> a list of entity IDs

  • satellite (str) – <str> the name of the satellite to restrict the query on

  • sensor (str) – <str> the name of the sensor to restrict the query on

  • src_folder (str) – <str> the source directory where archive files are saved

Return type

ndarray

gms_preprocessing.misc.database_tools.get_overlapping_MGRS_tiles(conn_params, scene_ID=None, tgt_corners_lonlat=None, timeout=15000)[source]

In contrast to pgSQL ‘Overlapping’ here means that both geometries share some spatial area. So it combines ST_Overlaps and ST_Contains.

gms_preprocessing.misc.database_tools.get_overlapping_MGRS_tiles2(conn_params, scene_ID=None, tgt_corners_lonlat=None, timeout=15000)[source]
gms_preprocessing.misc.database_tools.get_overlapping_scenes_from_postgreSQLdb(conn_params, table='scenes_proc', scene_ID=None, tgt_corners_lonlat=None, conditions=None, add_cmds='', timeout=15000)[source]

Queries the postgreSQL database in order to find those scenes of a specified reference satellite (Landsat-8 or Sentinel-2) that have an overlap to the given corner coordinates AND that fulfill the given conditions.

Parameters
  • conn_params (str) – <str> connection parameters as provided by CFG.conn_params

  • table (str) – <str> name of the table within the database to be updated

  • scene_ID (Optional[int]) – <int> a sceneID to get the target geographical extent from (needed if tgt_corners_lonlat is not provided)

  • tgt_corners_lonlat (Optional[list]) – <list> a list of coordinates defining the target geographical extent (needed if scene_ID is not provided)

  • conditions (Union[list, str, None]) – <list> a list of additional query conditions

  • add_cmds (str) – <str> additional pgSQL commands to be added to the pgSQL query

  • timeout (int) – <int> allows to set a custom statement timeout (milliseconds)

Return type

Union[list, str]

gms_preprocessing.misc.database_tools.get_pgSQL_geospatial_query_cond(conn_params, table2query, geomCol2use='bounds', tgt_corners_lonlat=None, scene_ID=None, queryfunc='ST_Intersects', crossing_dateline_check=True)[source]
gms_preprocessing.misc.database_tools.get_postgreSQL_matchingExp(key, value)[source]

Converts a key/value pair to a postgreSQL matching expression in the form “column=value” respecting postgreSQL type casts. The resulting string can be directly inserted into a postgreSQL query.

Return type

str

gms_preprocessing.misc.database_tools.get_postgreSQL_value(value)[source]

Converts Python variable to a postgreSQL value respecting postgreSQL type casts. The resulting value can be directly inserted into a postgreSQL query.

Return type

str
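
The two conversion helpers above can be pictured with a small, self-contained sketch. The functions `to_pg_value` and `to_pg_matching_exp` are hypothetical stand-ins written for this illustration; they are not the package's implementation, and the set of supported types and the exact output format are assumptions.

```python
def to_pg_value(value):
    """Hypothetical stand-in: render a Python value as a SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # bool must be tested before int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # escape embedded single quotes by doubling them
        return "'" + value.replace("'", "''") + "'"
    raise TypeError("unsupported type: %s" % type(value).__name__)


def to_pg_matching_exp(key, value):
    """Hypothetical stand-in: build a matching expression for one column."""
    if value is None:
        return "%s IS NULL" % key
    if isinstance(value, (list, tuple)):
        if not value:
            raise ValueError("empty list of values to match")
        return "%s IN (%s)" % (key, ", ".join(to_pg_value(v) for v in value))
    return "%s=%s" % (key, to_pg_value(value))


print(to_pg_matching_exp("satellite", "Landsat-8"))
print(to_pg_matching_exp("id", [26781907, 26781917]))
```

String building of this kind is shown only to illustrate the idea of a type-aware conversion; for queries that contain untrusted input, the parameter binding offered by the database driver is generally the safer route.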

gms_preprocessing.misc.database_tools.get_scene_and_dataset_infos_from_postgreSQLdb(sceneid)[source]

Creates an OrderedDict containing further information about a given scene ID by querying the pgSQL database.

Parameters

sceneid (int) – <int> the GMS scene ID to get information for

Return type

OrderedDict

gms_preprocessing.misc.database_tools.import_shapefile_into_postgreSQL_database(path_shp, tablename, cols2import=None, dtype_dic=None, if_exists='fail', index_label=None, primarykey=None)[source]

Imports all features of a shapefile into the specified table of the postgreSQL database. Geometry is automatically converted to postgreSQL geometry data type.

Parameters
  • path_shp (str) – <str> path of the shapefile to be imported

  • tablename (str) – <str> name of the table within the postgreSQL database where records shall be added

  • cols2import (Optional[list]) – <list> a list of column names to be imported

  • dtype_dic (Optional[dict]) – <dict> a dictionary of column names and corresponding postgreSQL types. The types should be a SQLAlchemy or GeoSQLAlchemy2 type, or a string for sqlite3 fallback connection.

  • if_exists (str) – <str> {‘fail’, ‘replace’, ‘append’} the action to be executed if target table already exists

  • index_label (Optional[str]) – <str> Column label for index column(s).

  • primarykey (Optional[str]) – <str> the name of the column to be set as primary key of the target table

Return type

None

gms_preprocessing.misc.database_tools.increment_decrement_arrayCol_in_postgreSQLdb(conn_params, tablename, col2update, idx_val2decrement=None, idx_val2increment=None, cond_dict=None, timeout=15000)[source]

Updates an array column of a specific postgreSQL table by incrementing or decrementing the elements at a given position. HINT: The column must have values like this: [52,0,27,10,8,0,0,0,0]

Parameters
  • conn_params (str) – <str> connection parameters as provided by CFG.conn_params

  • tablename (str) – <str> name of the table within the database to be updated

  • col2update (str) – <str> column name of the column to be updated

  • idx_val2decrement (Optional[int]) – <int> the index of the array element to be decremented (starts with 1)

  • idx_val2increment (Optional[int]) – <int> the index of the array element to be incremented (starts with 1)

  • cond_dict (Optional[dict]) – <dict> a dictionary containing the query conditions in the form {‘column_name’:<value>} HINT: <value> can also be a list or a tuple of elements to match

  • timeout (int) – <int> allows to set a custom statement timeout (milliseconds)

Return type

Optional[str]
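
The index convention described above (positions start at 1) can be illustrated on a plain Python list. This is only a sketch of the arithmetic: `shift_counts` is a hypothetical helper, and it does not touch a database or reproduce the SQL that the package issues.

```python
def shift_counts(counts, idx_val2decrement=None, idx_val2increment=None):
    """Return a copy of counts with one element decremented and/or incremented.

    Indices are 1-based, matching the parameter description above.
    """
    updated = list(counts)
    for idx, delta in ((idx_val2decrement, -1), (idx_val2increment, +1)):
        if idx is None:
            continue
        if not 1 <= idx <= len(updated):
            raise IndexError("index %d is outside 1..%d" % (idx, len(updated)))
        updated[idx - 1] += delta  # convert to Python's 0-based indexing
    return updated


stats = [52, 0, 27, 10, 8, 0, 0, 0, 0]
print(shift_counts(stats, idx_val2decrement=3, idx_val2increment=4))
# [52, 0, 26, 11, 8, 0, 0, 0, 0]
```

Moving one unit from position 3 to position 4 in this way mirrors how a scene could be counted under a different position of such a statistics array; what each position means is not specified here and is left open.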

gms_preprocessing.misc.database_tools.pdDataFrame_to_sql_k(engine, frame, name, if_exists='fail', index=True, index_label=None, schema=None, chunksize=None, dtype=None, **kwargs)[source]

Extends the standard function pandas.io.sql.SQLDatabase.to_sql() with ‘kwargs’, which make it possible, for example, to set the primary key of the target table. This is usually not possible with the standard to_sql() function.

Parameters
  • engine (any) – SQLAlchemy engine (created by sqlalchemy.create_engine)

  • frame (DataFrame) – the pandas.DataFrame or geopandas.GeoDataFrame to be exported to SQL-like database

  • name (str) – <str> Name of SQL table

  • if_exists (str) – <str> {‘fail’, ‘replace’, ‘append’} the action to be executed if target table already exists

  • index (bool) – <bool> Write DataFrame index as a column.

  • index_label (Optional[str]) – <str> Column label for index column(s).

  • schema (Optional[str]) – <str> Specify the schema (if database flavor supports this). If None, use default schema.

  • chunksize (Optional[int]) – <int> If not None, then rows will be written in batches of this size at a time. If None, all rows will be written at once.

  • dtype (Optional[dict]) – <dict> a dictionary of column names and corresponding postgreSQL types. The types should be a SQLAlchemy or GeoSQLAlchemy2 type.

  • kwargs (any) – keyword arguments to be passed to SQLTable

Return type

None

gms_preprocessing.misc.database_tools.postgreSQL_table_to_csv(conn_db, path_csv, tablename)[source]
gms_preprocessing.misc.database_tools.record_stats_memusage(conn_db, GMS_obj)[source]
Return type

bool

gms_preprocessing.misc.database_tools.remove_item_from_arrayCol_in_postgreSQLdb(conn_params, tablename, vals2remove_dict, cond_dict=None, timeout=15000)[source]

Queries a postgreSQL database for the given parameters and removes the given value from the specified column of the query result.

Parameters
  • conn_params (str) – <str> connection parameters as provided by CFG.conn_params

  • tablename (str) – <str> name of the table within the database to be updated

  • vals2remove_dict (dict) – <dict> a dictionary containing keys and value(s) to be set in the form {‘col_name’:[<value>,<value>]}

  • cond_dict (Optional[dict]) – <dict> a dictionary containing the query conditions in the form {‘column_name’:<value>} HINT: <value> can also be a list or a tuple of elements to match

  • timeout (int) – <int> allows to set a custom statement timeout (milliseconds)

Return type

Optional[str]

gms_preprocessing.misc.database_tools.update_records_in_postgreSQLdb(conn_params, tablename, vals2update_dict, cond_dict=None, timeout=15000)[source]

Queries a postgreSQL database for the given parameters and updates the given columns of the query result.

Parameters
  • conn_params (str) – <str> connection parameters as provided by CFG.conn_params

  • tablename (str) – <str> name of the table within the database to be updated

  • vals2update_dict (dict) – <dict> a dictionary containing keys and values to be set in the form {‘col_name’:<value>}

  • cond_dict (Optional[dict]) – <dict> a dictionary containing the query conditions in the form {‘column_name’:<value>} HINT: <value> can also be a list or a tuple of elements to match

  • timeout (int) – <int> allows to set a custom statement timeout (milliseconds)

Return type

Optional[str]

gms_preprocessing.misc.definition_dicts module

gms_preprocessing.misc.definition_dicts.datasetid_to_sat_sen(dsid)[source]
Return type

tuple

gms_preprocessing.misc.definition_dicts.get_GMS_sensorcode(GMS_id)[source]
Return type

str

gms_preprocessing.misc.definition_dicts.get_mask_classdefinition(maskname, satellite)[source]
gms_preprocessing.misc.definition_dicts.get_mask_colormap(maskname)[source]
gms_preprocessing.misc.definition_dicts.get_outFillZeroSaturated(dtype)[source]

Returns the values for ‘fill-‘, ‘zero-‘ and ‘saturated’ pixels of an image to be written with regard to the target data type.

Parameters

dtype – data type of the image to be written

gms_preprocessing.misc.definition_dicts.is_dataset_provided_as_fullScene(GMS_id)[source]

Returns True if the dataset belonging to the given GMS_identifier is provided as full scene and returns False if it is provided as multiple tiles.

Parameters

GMS_id (GMS_identifier) –

Return type

bool

gms_preprocessing.misc.definition_dicts.sat_sen_to_datasetid(satellite, sensor)[source]
Return type

int

gms_preprocessing.misc.environment module

class gms_preprocessing.misc.environment.GMSEnvironment(logger=None)[source]

Bases: object

GeoMultiSens Environment class.

check_dependencies()[source]
check_ecmwf_api_creds()[source]
static check_paths()[source]
check_ports()[source]
static check_read_write_permissions()[source]
static ensure_properly_activated_GDAL()[source]

gms_preprocessing.misc.exception_handler module

class gms_preprocessing.misc.exception_handler.ExceptionHandler(logger=None)[source]

Bases: object

property exc_details
static get_sample_GMS_obj(GMS_objs)[source]
Return type

Union[GMS_object, failed_GMS_object]

handle_failed()[source]
increment_progress()[source]

Update statistics column in jobs table of postgreSQL database.

NOTE: This function ONLY receives those GMS_objects that have been successfully processed by the GMS_mapper.

static is_failed(GMS_objs)[source]
log_uncaught_exceptions(GMS_mapper)[source]

Decorator function for handling unexpected exceptions that occur within GMS mapper functions. The traceback is sent to the logfile of the respective GMS object and the scene ID is added to the ‘failed_sceneids’ column within the jobs table of the postgreSQL database.

Parameters

GMS_mapper – A GMS mapper function that takes a GMS object, does some processing and returns it back.
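
The general pattern behind such a decorator can be sketched without the database. In the sketch below, `log_uncaught`, the plain list `failed_ids` (standing in for the ‘failed_sceneids’ column) and the dictionary used as a GMS object are all assumptions made for illustration; returning None on failure is likewise an assumption, not the documented behaviour.

```python
import functools
import logging
import traceback


def log_uncaught(logger, failed_ids):
    """Build a decorator that logs exceptions raised by a mapper function."""
    def decorator(mapper):
        @functools.wraps(mapper)
        def wrapper(obj):
            try:
                return mapper(obj)
            except Exception:
                # send the full traceback to the log instead of crashing the job
                logger.error("%s failed:\n%s", mapper.__name__,
                             traceback.format_exc())
                failed_ids.append(obj.get("scene_id"))
                return None  # assumption: None marks a failed object
        return wrapper
    return decorator


logger = logging.getLogger("gms_sketch")
failed_scene_ids = []


@log_uncaught(logger, failed_scene_ids)
def fragile_mapper(obj):
    """Toy mapper: takes an object, does some processing, returns it."""
    return {"scene_id": obj["scene_id"], "ratio": 1 / obj["divisor"]}


ok = fragile_mapper({"scene_id": 1, "divisor": 4})
failed = fragile_mapper({"scene_id": 2, "divisor": 0})
print(ok, failed, failed_scene_ids)
```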

property logger
static update_progress_failed(failed_Obj, procL_failed=None)[source]

Update statistics column in jobs table of postgreSQL database.

Parameters
  • failed_Obj – instance of gms_object failed_GMS_object

  • procL_failed – processing level to be decremented. If not given, the one from failed_Obj is used.

update_progress_started()[source]

In case of just initialized objects: update statistics column in jobs table of postgreSQL database to ‘started’.

gms_preprocessing.misc.exception_handler.ignore_warning(warning_type)[source]

A decorator to ignore a specific warning when executing a function.

Parameters

warning_type – the type of the warning to ignore
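
A decorator of this kind is commonly built on the standard library's `warnings.catch_warnings()` context manager. The following is a minimal sketch under that assumption; `suppress_warning` and `legacy_sum` are hypothetical names, and the package's own implementation may differ.

```python
import functools
import warnings


def suppress_warning(warning_type):
    """Build a decorator that silences one warning category during a call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with warnings.catch_warnings():
                # the filter only applies inside this context manager
                warnings.simplefilter("ignore", category=warning_type)
                return func(*args, **kwargs)
        return wrapper
    return decorator


@suppress_warning(DeprecationWarning)
def legacy_sum(a, b):
    warnings.warn("legacy_sum is deprecated", DeprecationWarning)
    return a + b


print(legacy_sum(1, 2))
```

Because the filter is installed inside the context manager, the previous warning configuration is restored as soon as the decorated function returns.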

gms_preprocessing.misc.exception_handler.log_uncaught_exceptions(GMS_mapper, logger=None)[source]
gms_preprocessing.misc.exception_handler.trace_unhandled_exceptions(func)[source]

gms_preprocessing.misc.exceptions module

exception gms_preprocessing.misc.exceptions.ACNotSupportedError[source]

Bases: RuntimeError

An error raised if there is currently no AC supported for the current sensor.

exception gms_preprocessing.misc.exceptions.FmaskError[source]

Bases: RuntimeError

An error within the Fmask wrapper of gms_preprocessing.

exception gms_preprocessing.misc.exceptions.FmaskWarning[source]

Bases: UserWarning

A warning within the Fmask wrapper of gms_preprocessing.

exception gms_preprocessing.misc.exceptions.GMSConfigParameterError[source]

Bases: ValueError

A wrong config parameter has been passed to GMS configuration.

exception gms_preprocessing.misc.exceptions.GMSEnvironmentError[source]

Bases: OSError

Missing package, that has not been automatically installed because it is not pip-installable.

exception gms_preprocessing.misc.exceptions.MissingNonPipLibraryWarning[source]

Bases: UserWarning

Missing package, that has not been automatically installed because it is not pip-installable.

gms_preprocessing.misc.helper_functions module

Collection of helper functions for GeoMultiSens.

gms_preprocessing.misc.helper_functions.CornerLonLat_to_shapelyPoly(CornerLonLat)[source]

Returns a shapely.Polygon() object based on the given coordinate list.

gms_preprocessing.misc.helper_functions.ENVIfile_to_ENVIcompressed(inPath_hdr, outPath_hdr=None)[source]
class gms_preprocessing.misc.helper_functions.Landsat_entityID_decrypter(entityID)[source]

Bases: object

SatDict = {'C8': 'Landsat-8', 'E7': 'Landsat-7', 'M1': 'Landsat-1', 'O8': 'Landsat-8', 'T4': 'Landsat-4', 'T5': 'Landsat-5', 'T8': 'Landsat-8'}
SenDict = {'C8': 'OLI_TIRS', 'E7': 'ETM+', 'M1': 'MSS1', 'O8': 'OLI', 'T4': 'TM', 'T5': 'TM', 'T8': 'TIRS'}
decrypt()[source]

LXSPPPRRRYYYYDDDGSIVV
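
The pattern above can be read position by position. The sketch below assumes the usual reading of pre-collection Landsat scene IDs (L = Landsat, X = sensor, S = satellite, PPP = WRS path, RRR = WRS row, YYYY = year, DDD = day of year, GSI = ground station identifier, VV = archive version). The function `decode_landsat_entity_id`, its field names, and the example ID are illustrative only; the two lookup tables are copied from the class attributes listed above.

```python
from datetime import datetime

SAT_DICT = {'C8': 'Landsat-8', 'E7': 'Landsat-7', 'M1': 'Landsat-1',
            'O8': 'Landsat-8', 'T4': 'Landsat-4', 'T5': 'Landsat-5',
            'T8': 'Landsat-8'}
SEN_DICT = {'C8': 'OLI_TIRS', 'E7': 'ETM+', 'M1': 'MSS1', 'O8': 'OLI',
            'T4': 'TM', 'T5': 'TM', 'T8': 'TIRS'}


def decode_landsat_entity_id(entity_id):
    """Split an ID of the form LXSPPPRRRYYYYDDDGSIVV into its fields."""
    if len(entity_id) != 21 or not entity_id.startswith("L"):
        raise ValueError("unexpected entity ID format: %r" % entity_id)
    key = entity_id[1:3]  # sensor letter + satellite number, e.g. 'C8'
    return {
        "satellite": SAT_DICT[key],
        "sensor": SEN_DICT[key],
        "wrs_path": int(entity_id[3:6]),
        "wrs_row": int(entity_id[6:9]),
        # %j parses the day of the year (001-366)
        "acquisition_date": datetime.strptime(entity_id[9:16], "%Y%j").date(),
        "ground_station": entity_id[16:19],
        "archive_version": entity_id[19:21],
    }


info = decode_landsat_entity_id("LC80440342013106LGN01")
print(info["satellite"], info["sensor"], info["acquisition_date"])
```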

gms_preprocessing.misc.helper_functions.convert_absPathArchive_to_GDALvsiPath(path_archive)[source]
gms_preprocessing.misc.helper_functions.cornerLonLat_to_postgreSQL_poly(CornerLonLat)[source]

Converts a coordinate list [UL_LonLat, UR_LonLat, LL_LonLat, LR_LonLat] to a postgreSQL polygon.

Parameters

CornerLonLat – list of XY-coordinate tuples

gms_preprocessing.misc.helper_functions.find_in_xml(xml, *branch)[source]

S2 xml helper function.

Parameters
  • xml – xml object

  • branch – iterate to branches using find

Returns

xml object, None if nothing was found

gms_preprocessing.misc.helper_functions.find_in_xml_root(namespace, xml_root, branch, *branches, findall=None)[source]

S2 xml helper function, search from root. Get part of xml.

Parameters
  • namespace

  • xml_root

  • branch – first branch, is combined with namespace

  • branches – repeated finds along these parameters

  • findall – if given, a final findall is performed

Returns

found xml object, None if nothing was found

gms_preprocessing.misc.helper_functions.get_UL_LR_from_shapefile_features(path_shp)[source]

Returns a list of upper-left-lower-right coordinates ((ul,lr) tuples) for all features of a given shapefile.

Parameters

path_shp (str) – <str> the path of the shapefile

Return type

list

gms_preprocessing.misc.helper_functions.get_arrSubsetBounds_from_shapelyPolyLonLat(arr_shape, shpPolyLonLat, im_gt, im_prj, pixbuffer=0, ensure_valid_coords=True)[source]

Returns validated image coordinates corresponding to the given shapely polygon. This function can be used to get the image coordinates of e.g. MGRS tiles for a specific target image.

Parameters
  • arr_shape (tuple) – <tuple of ints> the dimensions of the target image -> (rows, cols,bands) or (rows,cols)

  • shpPolyLonLat (Polygon) – <tuple of floats> the shapely polygon to get image coordinates for

  • im_gt (list) – <tuple> GDAL geotransform of the target image

  • im_prj (str) – <str> GDAL geographic projection (WKT string) of the target image (automatic reprojection is done if necessary)

  • pixbuffer (float) – <float> an optional buffer size (image pixel units)

  • ensure_valid_coords (bool) – <bool> whether to ensure that the returned values are all inside the original image bounding box

Return type

tuple

gms_preprocessing.misc.helper_functions.get_imageCoords_from_shapelyPoly(shapelyPoly, im_gt)[source]

Converts each vertex coordinate of a shapely polygon into image coordinates corresponding to the given geotransform without respect to invalid image coordinates. Those must be filtered later.

Parameters
  • shapelyPoly (Polygon) – <shapely.Polygon>

  • im_gt (list) – <list> the GDAL geotransform of the target image

Return type

list
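
The conversion from map coordinates to image coordinates follows from the GDAL geotransform. The sketch below works on a plain list of (x, y) vertex tuples instead of a shapely polygon and assumes a north-up image without rotation terms; `map_xy_to_image_xy` and `vertices_to_image_coords` are hypothetical names.

```python
def map_xy_to_image_xy(x, y, gt):
    """Convert one map coordinate to (column, row) using a GDAL geotransform.

    gt = (x_origin, pixel_width, rot_x, y_origin, rot_y, pixel_height)
    Assumption: north-up image, i.e. both rotation terms are zero.
    """
    if gt[2] != 0 or gt[4] != 0:
        raise ValueError("rotated geotransforms are not handled in this sketch")
    col = (x - gt[0]) / gt[1]
    row = (y - gt[3]) / gt[5]  # gt[5] is negative for north-up images
    return col, row


def vertices_to_image_coords(vertices, gt):
    """Convert every vertex; results may lie outside the image."""
    return [map_xy_to_image_xy(x, y, gt) for x, y in vertices]


gt = (300000.0, 30.0, 0.0, 5900000.0, 0.0, -30.0)
vertices = [(300000.0, 5900000.0), (300300.0, 5899700.0), (299970.0, 5900030.0)]
print(vertices_to_image_coords(vertices, gt))
# [(0.0, 0.0), (10.0, 10.0), (-1.0, -1.0)]
```

The last vertex maps to negative image coordinates, which shows why results of this step still have to be filtered or clipped afterwards.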

gms_preprocessing.misc.helper_functions.get_parentObjDict()[source]
gms_preprocessing.misc.helper_functions.get_valid_arrSubsetBounds(arr_shape, tgt_bounds, buffer=0)[source]

Validates a given tuple of image coordinates, by checking if each coordinate is within a given bounding box and replacing invalid coordinates by valid ones. This function is needed in connection with get_arrSubsetBounds_from_shapelyPolyLonLat().

Parameters
  • arr_shape (tuple) – <tuple of ints> the dimension of the bounding box where target coordinates are validated -> (rows, cols,bands) or (rows,cols)

  • tgt_bounds (tuple) – <tuple of floats> the target image coordinates in the form (xmin, xmax, ymin, ymax)

  • buffer (float) – <float> an optional buffer size (image pixel units)

Return type

tuple
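
Replacing invalid coordinates by valid ones amounts to clipping each value to the image extent. The sketch below assumes that valid columns are 0..cols-1, valid rows are 0..rows-1, and that the buffer widens the box before clipping; `clip_subset_bounds` is a hypothetical helper, not the package function.

```python
def clip_subset_bounds(arr_shape, tgt_bounds, buffer=0):
    """Clip (xmin, xmax, ymin, ymax) to an image of shape (rows, cols[, bands])."""
    rows, cols = arr_shape[:2]
    xmin, xmax, ymin, ymax = tgt_bounds

    def clip(value, upper):
        # force the value into the closed interval [0, upper]
        return min(max(value, 0), upper)

    return (clip(xmin - buffer, cols - 1),
            clip(xmax + buffer, cols - 1),
            clip(ymin - buffer, rows - 1),
            clip(ymax + buffer, rows - 1))


print(clip_subset_bounds((100, 200), (-5, 250, 10, 120)))
# (0, 199, 10, 99)
print(clip_subset_bounds((100, 200, 3), (20, 40, 10, 30), buffer=2))
# (18, 42, 8, 32)
```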

gms_preprocessing.misc.helper_functions.get_values_from_xml(leaf, dtype=<class 'float'>)[source]

S2 xml helper function.

Parameters
  • leaf – xml object which is searched for VALUES tags which are then composed into a numpy array

  • dtype – dtype of returned numpy array

Returns

numpy array

gms_preprocessing.misc.helper_functions.get_zipfile_namelist(path_zipfile)[source]
gms_preprocessing.misc.helper_functions.group_dicts_by_key(dict_list, key)[source]
gms_preprocessing.misc.helper_functions.group_objects_by_attributes(object_list, *attributes)[source]
gms_preprocessing.misc.helper_functions.group_tuples_by_keys_of_tupleElements(tuple_list, tupleElement_index, key)[source]
gms_preprocessing.misc.helper_functions.gzipfile(iname, oname, compression_level=1, blocksize=None)[source]
gms_preprocessing.misc.helper_functions.is_proc_level_lower(current_lvl, target_lvl)[source]

Return True if current_lvl is lower than target_lvl.

Parameters
  • current_lvl (str) – current processing level (to be tested)

  • target_lvl (str) – target processing level (reference)

Return type

bool
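
Comparing processing levels boils down to comparing positions in an ordered chain. The chain used below (L1A to L2C) is an assumption made for this sketch, and `is_level_lower` is a hypothetical name; the levels actually known to the package may differ.

```python
# Assumed order of processing levels, lowest first (not taken from the package).
PROC_CHAIN = ["L1A", "L1B", "L1C", "L2A", "L2B", "L2C"]


def is_level_lower(current_lvl, target_lvl):
    """Return True if current_lvl comes before target_lvl in PROC_CHAIN."""
    return PROC_CHAIN.index(current_lvl) < PROC_CHAIN.index(target_lvl)


print(is_level_lower("L1A", "L2A"))  # True
print(is_level_lower("L2C", "L1B"))  # False
print(is_level_lower("L1C", "L1C"))  # False
```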

class gms_preprocessing.misc.helper_functions.mp_SharedNdarray(dims)[source]

Bases: object

Wrapper class that collects all necessary instances to make a numpy ndarray accessible as shared memory when using multiprocessing. It exposes the numpy array via three different views which can be used to access it globally.

_init provides the mechanism to make this array available in each worker; it is best used via the provided __initializer__.

dims : tuple of dimensions which is used to instantiate a ndarray using np.zeros

gms_preprocessing.misc.helper_functions.mp_initializer(globals, globs)[source]

globs shall be a dict of name:value pairs. When executed, each value is added to globals under its name; if a value provides an _init attribute, that one is called instead.

This makes most sense when called as initializer in a multiprocessing pool, e.g.: Pool(initializer=__initializer__,initargs=(globs,))

Parameters
  • globals

  • globs
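
The behaviour described above can be sketched with an ordinary dictionary standing in for the worker's global namespace. The function `init_worker_globals`, the toy class `LazyTable`, and in particular the call signature `_init(namespace, name)` are assumptions made for illustration; the documentation above does not state how `_init` is invoked.

```python
def init_worker_globals(namespace, globs):
    """Publish each name:value pair of globs in namespace.

    Values that provide an _init attribute are asked to register themselves.
    """
    for name, value in globs.items():
        init = getattr(value, "_init", None)
        if callable(init):
            init(namespace, name)  # assumed signature, see lead-in
        else:
            namespace[name] = value


class LazyTable:
    """Toy object whose _init hook registers derived data, not itself."""

    def __init__(self, size):
        self.size = size

    def _init(self, namespace, name):
        namespace[name] = list(range(self.size))


shared = {}
init_worker_globals(shared, {"threshold": 0.5, "table": LazyTable(3)})
print(shared)
# {'threshold': 0.5, 'table': [0, 1, 2]}
```

In a multiprocessing pool, a function of this shape would run once per worker process, so that every worker sees the same names without receiving them as arguments to each task.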

gms_preprocessing.misc.helper_functions.postgreSQL_geometry_to_postgreSQL_poly(geom)[source]
Return type

str

gms_preprocessing.misc.helper_functions.postgreSQL_geometry_to_shapelyPolygon(wkb_hex)[source]
gms_preprocessing.misc.helper_functions.postgreSQL_poly_to_cornerLonLat(pGSQL_poly)[source]

Converts a postgreSQL polygon to a coordinate list [UL_LonLat, UR_LonLat, LL_LonLat, LR_LonLat].

Parameters

pGSQL_poly (str) –

Return type

list

gms_preprocessing.misc.helper_functions.reorder_CornerLonLat(CornerLonLat)[source]

Reorders corner coordinate lists from [UL,UR,LL,LR] to clockwise order: [UL,UR,LR,LL]
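
The reordering swaps the last two entries, so that consecutive corners trace the outline of the footprint instead of crossing it diagonally. A minimal sketch, with `reorder_corners_clockwise` as a hypothetical name and made-up coordinates:

```python
def reorder_corners_clockwise(corner_lonlat):
    """Reorder [UL, UR, LL, LR] to the clockwise order [UL, UR, LR, LL]."""
    ul, ur, ll, lr = corner_lonlat
    return [ul, ur, lr, ll]


corners = [(10.0, 50.0), (11.0, 50.0), (10.0, 49.0), (11.0, 49.0)]
print(reorder_corners_clockwise(corners))
# [(10.0, 50.0), (11.0, 50.0), (11.0, 49.0), (10.0, 49.0)]
```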

gms_preprocessing.misc.helper_functions.safe_str(obj)[source]

Return a safe string that will not cause any UnicodeEncodeError issues.

gms_preprocessing.misc.helper_functions.sceneID_to_trueDataCornerLonLat(scene_ID)[source]

Returns a list of corner coordinates ordered like (UL,UR,LL,LR) corresponding to the given scene_ID by querying the database geometry field.

gms_preprocessing.misc.helper_functions.scene_ID_to_shapelyPolygon(scene_ID)[source]

Returns a LonLat shapely.Polygon() object corresponding to the given scene_ID.

Return type

Polygon

gms_preprocessing.misc.helper_functions.shapelyPolygon_to_postgreSQL_geometry(shapelyPoly)[source]
Return type

str

gms_preprocessing.misc.helper_functions.silentmkdir(path_dir_file)[source]
Return type

None

gms_preprocessing.misc.helper_functions.silentremove(filename)[source]

Remove the given file without raising OSError exceptions, e.g. if the file does not exist.

Return type

None
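
A common way to write such a helper is to catch OSError and re-raise everything except the "file not found" case. The sketch below follows that pattern; `remove_if_exists` is a hypothetical name, and suppressing only ENOENT is an assumption, since the description above does not say which errors are swallowed.

```python
import errno
import os
import tempfile


def remove_if_exists(filename):
    """Delete filename; do nothing if it does not exist."""
    try:
        os.remove(filename)
    except OSError as exc:
        if exc.errno != errno.ENOENT:  # re-raise anything but "no such file"
            raise


# Demonstration with a temporary file
fd, path = tempfile.mkstemp()
os.close(fd)
remove_if_exists(path)   # removes the file
remove_if_exists(path)   # second call is a silent no-op
file_is_gone = not os.path.exists(path)
print(file_is_gone)
```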

gms_preprocessing.misc.helper_functions.sorted_nicely(iterable)[source]

Sort the given iterable in the way that humans expect.

Parameters

iterable
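
Sorting "the way humans expect" usually means natural sorting, where embedded numbers are compared by value rather than character by character. The sketch below shows the widely used recipe based on `re.split`; `natural_sort` is a hypothetical name, and ignoring letter case is an assumption.

```python
import re


def natural_sort(iterable):
    """Sort strings so that 'band2' comes before 'band10'."""
    def key(text):
        # split into alternating text and digit runs; digit runs become ints
        return [int(token) if token.isdigit() else token.lower()
                for token in re.split(r"(\d+)", text)]
    return sorted(iterable, key=key)


names = ["band10.tif", "band2.tif", "band1.tif"]
print(sorted(names))        # ['band1.tif', 'band10.tif', 'band2.tif']
print(natural_sort(names))  # ['band1.tif', 'band2.tif', 'band10.tif']
```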

gms_preprocessing.misc.helper_functions.stack_detectors(inp)[source]
gms_preprocessing.misc.helper_functions.subcall_with_output(cmd, no_stdout=False, no_stderr=False)[source]

Execute external command and get its stdout, exitcode and stderr.

Parameters

cmd – a normal shell command including parameters
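
A helper with this interface can be sketched on top of the standard library's `subprocess.run`. The function `run_command` is a hypothetical stand-in; splitting the command with `shlex` instead of passing it to a shell, and returning None for a suppressed stream, are assumptions of this sketch.

```python
import shlex
import subprocess
import sys


def run_command(cmd, no_stdout=False, no_stderr=False):
    """Run cmd and return (stdout, exitcode, stderr); streams are bytes."""
    proc = subprocess.run(
        shlex.split(cmd),
        stdout=subprocess.DEVNULL if no_stdout else subprocess.PIPE,
        stderr=subprocess.DEVNULL if no_stderr else subprocess.PIPE,
    )
    return proc.stdout, proc.returncode, proc.stderr


# Use the running interpreter as a portable example command
cmd = "%s -c %s" % (shlex.quote(sys.executable), shlex.quote("print(6 * 7)"))
out, exitcode, err = run_command(cmd)
print(out, exitcode, err)
```

Capturing the exit code alongside both streams lets the caller decide whether a non-zero status is an error, instead of raising immediately.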

gms_preprocessing.misc.helper_functions.subplot_2dline(XY_tuples, titles=None, shapetuple=None, grid=False)[source]
gms_preprocessing.misc.helper_functions.subplot_3dsurface(ims, shapetuple=None)[source]
gms_preprocessing.misc.helper_functions.subplot_imshow(ims, titles=None, shapetuple=None, grid=False)[source]

gms_preprocessing.misc.locks module

class gms_preprocessing.misc.locks.DatabaseLock(allowed_slots=1, logger=None, **kwargs)[source]

Bases: gms_preprocessing.misc.locks.SharedResourceLock

class gms_preprocessing.misc.locks.IOLock(allowed_slots=1, logger=None, **kwargs)[source]

Bases: gms_preprocessing.misc.locks.SharedResourceLock

class gms_preprocessing.misc.locks.MemoryReserver(mem2lock_gb, max_usage=90, logger=None)[source]

Bases: object

Parameters

mem2lock_gb – Amount of memory to be reserved while the lock is held (gigabytes).

acquire(timeout=20)[source]
property acquisition_key
delete()[source]
property mem_reserved_gb
release()[source]
property reserved_key
property reserved_key_jobID
property usable_memory_gb
property waiting
property waiting_key
property waiting_key_jobID
class gms_preprocessing.misc.locks.MultiSlotLock(name='MultiSlotLock', allowed_slots=1, logger=None, **kwargs)[source]

Bases: redis_semaphore.Semaphore

acquire(timeout=0, target=None)[source]
delete()[source]
release()[source]
class gms_preprocessing.misc.locks.ProcessLock(allowed_slots=1, logger=None, **kwargs)[source]

Bases: gms_preprocessing.misc.locks.SharedResourceLock

class gms_preprocessing.misc.locks.SharedResourceLock(name='MultiSlotLock', allowed_slots=1, logger=None, **kwargs)[source]

Bases: gms_preprocessing.misc.locks.MultiSlotLock

acquire(timeout=0, target=None)[source]
delete()[source]
property grabbed_key_jobID
release_all_jobID_tokens()[source]
signal(token)[source]
gms_preprocessing.misc.locks.acquire_process_lock(**processlock_kwargs)[source]

Decorator function for ProcessLock.

Parameters

processlock_kwargs – Keyword arguments to be passed to ProcessLock class.
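A decorator that takes keyword arguments and forwards them to a lock class typically follows the decorator-factory pattern. The sketch below illustrates that pattern with a purely local stand-in; LocalSlotLock and acquire_lock_sketch are hypothetical names, and the real ProcessLock is based on a Redis semaphore and works across processes, which this stand-in does not:

```python
import functools
import threading


class LocalSlotLock:
    """Hypothetical stand-in for ProcessLock, limited to one Python process."""

    def __init__(self, allowed_slots=1, **kwargs):
        self._sem = threading.BoundedSemaphore(allowed_slots)

    def acquire(self):
        self._sem.acquire()

    def release(self):
        self._sem.release()


def acquire_lock_sketch(**lock_kwargs):
    """Decorator factory: hold one lock slot while the wrapped function runs."""
    lock = LocalSlotLock(**lock_kwargs)  # one lock per decorated function

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            lock.acquire()
            try:
                return func(*args, **kwargs)
            finally:
                lock.release()  # released even if func raises
        return wrapper
    return decorator


@acquire_lock_sketch(allowed_slots=2)
def process_scene(scene_id):
    return scene_id * 2


print(process_scene(21))
```

Releasing in a finally block matters here: a slot that stays occupied after an exception would block every later caller.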

gms_preprocessing.misc.locks.release_unclosed_locks()[source]
gms_preprocessing.misc.locks.reserve_mem(**memlock_kwargs)[source]

Decorator function for MemoryReserver.

Parameters

memlock_kwargs – Keyword arguments to be passed to MemoryReserver class.

gms_preprocessing.misc.logging module

class gms_preprocessing.misc.logging.GMS_logger(name_logfile, fmt_suffix=None, path_logfile=None, log_level='INFO', append=True, log_to_joblog=True)[source]

Bases: logging.Logger

Returns a logging.Logger instance pointing to the given logfile path.

Parameters
  • name_logfile (str) –

  • fmt_suffix (Optional[any]) – if given, it will be included in the log formatter

  • path_logfile (Optional[str]) – if no path is given, only a StreamHandler is created

  • log_level (any) – the logging level to be used (choices: ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’; default: ‘INFO’)

  • append (bool) – <bool> whether to append the log message to an existing logfile (1) or to create a new logfile (0); default=1

  • log_to_joblog (bool) – whether to additionally log all messages to the logfile of the GMS job (default=1)
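The handler setup implied by path_logfile, log_level and append can be sketched with the standard logging module. This is not the GMS_logger implementation: the name make_logger_sketch and the format string are hypothetical, and it is an assumption that a FileHandler is added alongside the StreamHandler when a path is given:

```python
import logging
import sys


def make_logger_sketch(name, path_logfile=None, log_level='INFO', append=True):
    """Build a logger with a stream handler and an optional file handler."""
    logger = logging.getLogger(name)
    logger.setLevel(log_level)  # accepts level names such as 'INFO'
    formatter = logging.Formatter('%(asctime)s %(levelname)s: %(message)s')

    stream_handler = logging.StreamHandler(sys.stdout)
    stream_handler.setFormatter(formatter)
    logger.addHandler(stream_handler)

    if path_logfile is not None:
        # 'a' appends to an existing logfile, 'w' starts a new one.
        file_handler = logging.FileHandler(path_logfile, mode='a' if append else 'w')
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

    return logger


logger = make_logger_sketch('gms_sketch_demo')
logger.info('only written to the stream, no logfile path was given')
```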

property captured_stream
close()[source]
view_logfile()[source]
class gms_preprocessing.misc.logging.LessThanFilter(exclusive_maximum, name='')[source]

Bases: logging.Filter

Filter class to filter log messages by a maximum log level.

Based on http://stackoverflow.com/questions/2302315/how-can-info-and-debug-logging-message-be-sent-to-stdout-and-higher-level-messag

Get an instance of LessThanFilter.

Parameters
  • exclusive_maximum – maximum log level, e.g., logging.WARNING

  • name

filter(record)[source]

Filter function.

NOTE: Returns True if logging level of the given record is below the maximum log level.

Parameters

record

Returns

bool
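A filter with an exclusive maximum level is typically used to send low-level messages to stdout while warnings and errors go to stderr. The sketch below shows that pattern with the standard logging module; LessThanFilterSketch is a hypothetical re-implementation written from the description above, not the package's class:

```python
import logging
import sys


class LessThanFilterSketch(logging.Filter):
    """Pass only records whose level is below an exclusive maximum."""

    def __init__(self, exclusive_maximum, name=''):
        super().__init__(name)
        self.max_level = exclusive_maximum

    def filter(self, record):
        # True lets the record through, False drops it.
        return record.levelno < self.max_level


logger = logging.getLogger('less_than_filter_demo')
logger.setLevel(logging.DEBUG)

# DEBUG and INFO go to stdout ...
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.addFilter(LessThanFilterSketch(logging.WARNING))
logger.addHandler(stdout_handler)

# ... WARNING and above go to stderr.
stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setLevel(logging.WARNING)
logger.addHandler(stderr_handler)

logger.info('written to stdout')
logger.error('written to stderr')
```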

gms_preprocessing.misc.logging.close_logger(logger)[source]
gms_preprocessing.misc.logging.shutdown_loggers()[source]

gms_preprocessing.misc.path_generator module

gms_preprocessing.misc.path_generator.get_path_ac_options(GMS_id)[source]

Returns the path of the options json file needed for atmospheric correction.

Return type

Union[str, None]

gms_preprocessing.misc.path_generator.get_path_cloud_class_obj(GMS_id, get_all=False)[source]

Returns the absolute path of the training data used by the cloud classifier.

Parameters
  • GMS_id

  • get_all

gms_preprocessing.misc.path_generator.get_path_snr_model(GMS_id)[source]

Returns the absolute path of the SNR model for the given sensor.

Parameters

GMS_id (GMS_identifier) –

Return type

str

gms_preprocessing.misc.path_generator.get_tempfile(ext=None, prefix=None, tgt_dir=None)[source]

Returns the path to a tempfile.mkstemp() file that can be passed to any function that expects a physical path. The tempfile has to be deleted manually.

Parameters
  • ext – file extension (None if None)

  • prefix – optional file prefix

  • tgt_dir – target directory (automatically set if None)
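The note about manual deletion follows from how tempfile.mkstemp() works: it creates the file and returns an open file descriptor plus the path, and nothing removes the file automatically. A minimal sketch, assuming the arguments map directly onto mkstemp's suffix, prefix and dir; the name get_tempfile_sketch is hypothetical:

```python
import os
import tempfile


def get_tempfile_sketch(ext=None, prefix=None, tgt_dir=None):
    """Create an empty temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=ext, prefix=prefix, dir=tgt_dir)
    os.close(fd)  # only the path is needed, so close the descriptor right away
    return path


path = get_tempfile_sketch(ext='.bsq', prefix='gms_')
try:
    print(os.path.basename(path))
finally:
    os.remove(path)  # the caller is responsible for cleaning up
```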

class gms_preprocessing.misc.path_generator.path_generator(*args, **kwargs)[source]

Bases: object

Methods return absolute paths corresponding to the input object. To be instantiated with the dict of a L1A/L1B/… object or a list with the attributes below. If ‘scene_ID’ (integer) is passed to kwargs, any positional args that were given are ignored; the values are retrieved from the postgreSQL database instead.

get_baseN(merged_subsystems=False)[source]

Returns the basename belonging to the given scene.

Parameters

merged_subsystems – if True, a subsystem is not included in the returned basename (useful for merged subsystems in L2A+)

get_local_archive_path_baseN()[source]

Returns the path of the downloaded raw data archive, e.g. ‘/path/to/file/file.tar.gz’.

get_outPath_hdr(attrName2write)[source]

Returns the output path for the given attribute to be written.

Parameters

attrName2write (str) – <str> name of the GMS object attribute to be written

Return type

str

get_path_ac_input_dump()[source]

Returns the path of the .dill file containing a dump of atmospheric correction inputs, e.g. ‘/path/to/file/file.dill’.

get_path_accuracylayers()[source]

Returns the path of the _accuracy_layers_.bsq file, e.g., ‘/path/to/file/file_accuracy_layers_L2C.bsq’.

NOTE: Accuracy layers are only present in L2C.

get_path_cloudmaskdata()[source]

Returns the path of the _mask_clouds_.bsq file belonging to the given processing level, e.g. ‘/path/to/file/file_mask_clouds_L1A.bsq’.

get_path_gmsfile()[source]

Returns the path of the .gms file belonging to the given processing level, e.g. ‘/path/to/file/file.gms’.

get_path_imagedata()[source]

Returns the path of the .bsq file belonging to the given processing level, e.g. ‘/path/to/file/file.bsq’.

get_path_logfile(merged_subsystems=False)[source]

Returns the path of the logfile belonging to the given scene, e.g. ‘/path/to/file/file.log’.

Parameters

merged_subsystems – if True, a subsystem is not included in the returned logfile path (useful for merged subsystems in L2A+)

get_path_maskdata()[source]

Returns the path of the _masks_.bsq file belonging to the given processing level, e.g. ‘/path/to/file/file_masks_L1A.bsq’.

get_path_procdata()[source]

Returns the target folder of all processed data for the current scene.

get_path_rawdata()[source]

Returns the folder of all downloaded data for the current scene.

get_path_tempdir()[source]
get_pathes_all_procdata()[source]
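The example paths given for get_path_maskdata(), get_path_cloudmaskdata() and get_path_accuracylayers() share the pattern basename, layer kind, processing level, followed by the .bsq extension. The sketch below only reproduces that naming pattern from the examples; the function name mask_path_sketch and its arguments are hypothetical, and how path_generator derives the folder and basename is not shown in this documentation:

```python
import os


def mask_path_sketch(directory, basename, kind, proc_level):
    """Compose '<directory>/<basename>_<kind>_<proc_level>.bsq'."""
    filename = '%s_%s_%s.bsq' % (basename, kind, proc_level)
    return os.path.join(directory, filename)


# Layer kinds and processing levels taken from the documented examples.
for kind, level in [('masks', 'L1A'), ('mask_clouds', 'L1A'), ('accuracy_layers', 'L2C')]:
    print(mask_path_sketch('/path/to/file', 'file', kind, level))
```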

gms_preprocessing.misc.spatial_index_mediator module

class gms_preprocessing.misc.spatial_index_mediator.Connection(host, port, timeout)[source]

Bases: object

Connection to the spatial index mediator server

DISCONNECT_MSG = 6

message value for a disconnect message

HELLO_MSG = 1

message value for a “hello” message

disconnect()[source]

Closes the connection to the index mediator server.

No further communication, such as placing queries, will be possible afterwards.

recvBuffer(buffer, numBytes)[source]
recvByte()[source]
recvInt()[source]
writeByte(byte)[source]
class gms_preprocessing.misc.spatial_index_mediator.Scene(sceneid, acquisitiondate, cloudcover, proclevel, daynight, bounds)[source]

Bases: object

Scene Metadata class

Parameters
  • sceneid – database sceneid, e.g. 26366229

  • acquisitiondate – acquisition date of the scene as datetime instance, e.g. 2016-03-25 10:15:26

  • cloudcover – cloudcover value of the scene, e.g. 11

  • daynight – day/night indicator (0=unknown, 1=day, 2=night)

  • bounds – scene bounds as list of WGS84 lon/lat coordinates (lon1, lat1, lon2, lat2, …), e.g. (10.00604, 49.19385, 7.45638, 49.64513, 8.13739, 51.3515, 10.77705, 50.89307)

class gms_preprocessing.misc.spatial_index_mediator.SpatialIndexMediator(host='localhost', port=8654, timeout=5.0, retries=10)[source]

Bases: object

Establishes a connection to the spatial index mediator server.

Parameters
  • host – host address of the index mediator server (default “localhost”)

  • port – port number of the index mediator server (default 8654)

  • timeout – timeout as float in seconds (default 5.0 sec)

  • retries – number of retries in case of timeout

FULL_SCENE_QUERY_MSG = 3

message value for a full scene query message

getFullSceneDataForDataset(envelope, timeStart, timeEnd, minCloudCover, maxCloudCover, datasetid, dayNight=0, refDate=None, maxDaysDelta=None)[source]

Query the spatial index with the given parameters in order to get a list of matching scenes intersecting the given envelope.

Parameters
  • envelope (list) – list of left, right and low, up coordinates (in lat/lon wgs84) of the region of interest in the form of (min_lon, max_lon, min_lat, max_lat), e.g. envelope = (10.0, 16.0, 50.0, 60.0)

  • timeStart (datetime) – start timestamp of the relevant timerange as datetime instance, e.g., datetime(2015, 1, 1)

  • timeEnd (datetime) – end timestamp of the relevant timerange as datetime instance, e.g. datetime(2016, 6, 15)

  • minCloudCover (float) – minimum cloudcover in percent, e.g. 12, will return scenes with cloudcover >= 12% only

  • maxCloudCover (float) – maximum cloudcover in percent, e.g. 23, will return scenes with cloudcover <= 23% only

  • datasetid (int) – datasetid of the dataset in question, e.g. 104 for Landsat-8

  • dayNight – day/night indicator (0 = both, 1 = day, 2 = night)

  • refDate (Optional[datetime]) – reference timestamp as datetime instance, e.g. datetime(2015, 1, 1) [optional]

  • maxDaysDelta (Optional[int]) – maximum allowed number of days the target scenes might be apart from the given refDate [optional]

Return type

List[Scene]
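How the query parameters combine can be illustrated on in-memory records, without a running index mediator server. The sketch below is not the server's logic: SceneRecord, bbox_overlaps and matches_query are hypothetical names, the intersection test is simplified to a bounding-box overlap, and the dayNight and datasetid filters are left out:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass
class SceneRecord:
    """Hypothetical stand-in for the Scene metadata class."""
    sceneid: int
    acquisitiondate: datetime
    cloudcover: float
    bounds: Sequence[float]  # (lon1, lat1, lon2, lat2, ...)


def bbox_overlaps(bounds, envelope):
    """Assumption: intersection is approximated by bounding-box overlap."""
    lons, lats = bounds[0::2], bounds[1::2]
    min_lon, max_lon, min_lat, max_lat = envelope
    return (min(lons) <= max_lon and max(lons) >= min_lon and
            min(lats) <= max_lat and max(lats) >= min_lat)


def matches_query(scene, envelope, time_start, time_end, min_cc, max_cc,
                  ref_date: Optional[datetime] = None,
                  max_days_delta: Optional[int] = None):
    """Return True if the scene passes all filters."""
    if not bbox_overlaps(scene.bounds, envelope):
        return False
    if not (time_start <= scene.acquisitiondate <= time_end):
        return False
    if not (min_cc <= scene.cloudcover <= max_cc):
        return False
    if ref_date is not None and max_days_delta is not None:
        # Distance to the reference date counts in both directions.
        if abs(scene.acquisitiondate - ref_date) > timedelta(days=max_days_delta):
            return False
    return True


# Values taken from the parameter examples in this documentation.
scene = SceneRecord(26366229, datetime(2016, 3, 25, 10, 15, 26), 11,
                    (10.00604, 49.19385, 7.45638, 49.64513,
                     8.13739, 51.3515, 10.77705, 50.89307))
envelope = (10.0, 16.0, 50.0, 60.0)  # (min_lon, max_lon, min_lat, max_lat)
print(matches_query(scene, envelope, datetime(2015, 1, 1), datetime(2016, 6, 15), 0, 23))
```

Note that the envelope lists both longitudes first and then both latitudes, whereas the scene bounds alternate longitude and latitude per corner.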

class gms_preprocessing.misc.spatial_index_mediator.SpatialIndexMediatorServer(rootDir, logger=None)[source]

Bases: object

property is_running
property process_id
restart()[source]
start()[source]
property status

Check server status.

Returns
  • running (bool) – running or not?

  • process_id (int)

stop()[source]

Module contents