spechomo package

Submodules

spechomo.classifier module

class spechomo.classifier.ClassifierCollection(path_dillFile)[source]

Bases: object

class spechomo.classifier.Cluster_Learner(dict_clust_MLinstances, global_classifier)[source]

Bases: object

A class that holds the machine learning classifiers for multiple spectral clusters as well as a global classifier.

These classifiers can be applied to an input sensor image by using the predict method.

Get an instance of Cluster_Learner.

Parameters
  • dict_clust_MLinstances (Union[dict, ClassifierCollection]) – a dictionary of cluster specific machine learning classifiers

  • global_classifier (any) – the global machine learning classifier to be applied at image positions with high spectral dissimilarity to the available cluster centers

classmethod from_disk(classifier_rootDir, method, n_clusters, src_satellite, src_sensor, src_LBA, tgt_satellite, tgt_sensor, tgt_LBA, n_estimators=50)[source]

Read a previously saved Cluster_Learner from disk and return a Cluster_Learner instance.

Describe the classifier specifications with the given arguments.

Parameters
  • classifier_rootDir (str) – root directory of the classifiers

  • method (str) – harmonization method: ‘LR’ (Linear Regression), ‘RR’ (Ridge Regression), ‘QR’ (Quadratic Regression), ‘RFR’ (Random Forest Regression; 50 trees; does not allow spectral sub-clustering)

  • n_clusters (int) – number of clusters

  • src_satellite (str) – source satellite, e.g., ‘Landsat-8’

  • src_sensor (str) – source sensor, e.g., ‘OLI_TIRS’

  • src_LBA (list) – source LayerBandsAssignment

  • tgt_satellite (str) – target satellite, e.g., ‘Landsat-8’

  • tgt_sensor (str) – target sensor, e.g., ‘OLI_TIRS’

  • tgt_LBA (list) – target LayerBandsAssignment

  • n_estimators (int) – number of estimators (only used if method == ‘RFR’)

Return type

Cluster_Learner

Returns

classifier instance loaded from disk

plot_sample_spectra(cluster_label='all', include_mean_spectrum=True, include_median_spectrum=True, ncols=5, **kw_fig)[source]
Return type

plt.figure

predict(im_src, cmap, nodataVal=None, cmap_nodataVal=None, cmap_unclassifiedVal=-1)[source]

Predict target satellite spectral information using separate prediction coefficients for spectral clusters.
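The idea of cluster-wise prediction can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the spechomo implementation: the helper predict_per_cluster, the (weights, intercept) layout of the coefficients and the fill behaviour are hypothetical, and each cluster is assumed to use a simple linear model.

```python
import numpy as np


def predict_per_cluster(im_src, cmap, cluster_coefs, global_coefs,
                        nodata_val=None, cmap_nodata_val=None,
                        cmap_unclassified_val=-1):
    """Apply one linear model per cluster label (hypothetical helper).

    cluster_coefs maps a cluster label to a (weights, intercept) pair with
    weights of shape (n_src_bands, n_tgt_bands). global_coefs has the same
    layout and is used at unclassified pixels.
    """
    rows, cols, _ = im_src.shape
    n_tgt = global_coefs[0].shape[1]
    fill = 0 if nodata_val is None else nodata_val
    out = np.full((rows, cols, n_tgt), fill, dtype=float)

    for label in np.unique(cmap):
        if cmap_nodata_val is not None and label == cmap_nodata_val:
            continue  # keep the nodata fill value at these positions
        if label == cmap_unclassified_val:
            weights, intercept = global_coefs
        else:
            weights, intercept = cluster_coefs[int(label)]
        mask = cmap == label
        out[mask] = im_src[mask] @ weights + intercept
    return out


# Tiny 2 x 2 image with 3 source bands and 2 target bands.
im = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
cmap = np.array([[0, 1], [-1, -9999]])
coefs = {
    0: (np.ones((3, 2)), np.zeros(2)),
    1: (2 * np.ones((3, 2)), np.zeros(2)),
}
global_coefs = (np.zeros((3, 2)), np.full(2, 5.0))
result = predict_per_cluster(im, cmap, coefs, global_coefs,
                             nodata_val=-9999, cmap_nodata_val=-9999)
```

In this sketch, the pixel labelled -1 falls back to the global coefficients and the pixel labelled -9999 keeps the nodata fill value.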

Parameters
  • im_src (Union[ndarray, GeoArray]) – input image to be used for prediction

  • cmap (ndarray) – classification map that assigns each image spectrum to a corresponding cluster; must be a 2D np.ndarray with the same X-/Y-dimension as im_src

  • nodataVal (Union[int, float, None]) – nodata value to be used to fill into the predicted image

  • cmap_nodataVal (Union[int, float, None]) – nodata class value of the nodata class of the classification map

  • cmap_unclassifiedVal (Union[int, float]) – class value of the ‘unclassified’ class of the classification map

Return type

ndarray

Returns

predict_weighted_averages(im_src, cmap_3D, weights_3D=None, nodataVal=None, cmap_nodataVal=None, cmap_unclassifiedVal=-1)[source]

Predict target satellite spectral information using separate prediction coefficients for spectral clusters.

NOTE: This version of the prediction function uses the prediction coefficients of multiple spectral clusters and computes the result as a weighted average of them. Therefore, the classification map must assign multiple spectral clusters to each input pixel.

NOTE: At unclassified pixels (cmap_3D[y,x,z>0] == -1), the prediction result using global coefficients is ignored in the weighted average. In that case, the prediction result is based on the valid spectral clusters that were found and is not affected by the global coefficients (this should improve prediction results).
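A minimal NumPy sketch of such a weighted average follows. It assumes that the per-cluster predictions have already been computed and stacked along a third axis; the helper weighted_cluster_average and the array layout are hypothetical and only illustrate how unclassified slots can be dropped from the average.

```python
import numpy as np


def weighted_cluster_average(preds, cmap_3d, weights_3d=None,
                             cmap_unclassified_val=-1):
    """Average per-cluster predictions for each pixel (hypothetical helper).

    preds:      (rows, cols, k, n_bands) prediction of each assigned cluster
    cmap_3d:    (rows, cols, k) cluster labels assigned to each pixel
    weights_3d: (rows, cols, k) weights; equal weights if None
    """
    if weights_3d is None:
        weights_3d = np.ones(cmap_3d.shape, dtype=float)
    weights_3d = np.asarray(weights_3d, dtype=float)
    # Zero the weight of unclassified slots so they drop out of the average.
    w = np.where(cmap_3d == cmap_unclassified_val, 0.0, weights_3d)
    w_sum = w.sum(axis=2, keepdims=True)
    # Pixels without any valid cluster keep all-zero weights; avoid 0/0.
    w_norm = np.divide(w, w_sum, out=np.zeros_like(w), where=w_sum > 0)
    return (preds * w_norm[..., np.newaxis]).sum(axis=2)


# 1 x 2 image, two clusters per pixel, one target band.
preds = np.array([[[[10.0], [20.0]], [[10.0], [20.0]]]])
cmap_3d = np.array([[[3, 7], [3, -1]]])
weights = np.array([[[1.0, 3.0], [1.0, 3.0]]])
avg = weighted_cluster_average(preds, cmap_3d, weights)
```

The second pixel has one unclassified slot, so its result depends only on the remaining valid cluster.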

Parameters
  • im_src (Union[ndarray, GeoArray]) – input image to be used for prediction

  • cmap_3D (ndarray) – classification map that assigns each image spectrum to multiple corresponding clusters; must be a 3D np.ndarray with the same X-/Y-dimension as im_src

  • weights_3D (Optional[ndarray]) –

  • nodataVal (Union[int, float, None]) – nodata value to be used to fill into the predicted image

  • cmap_nodataVal (Union[int, float, None]) – nodata class value of the nodata class of the classification map

  • cmap_unclassifiedVal (Union[int, float]) – class value of the ‘unclassified’ class of the classification map

Return type

ndarray

Returns

print_stats()[source]
save_to_json(filepath)[source]
to_jsonable_dict()[source]

Create a dictionary containing a JSONable replicate of the current Cluster_Learner instance.
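To illustrate what a "JSONable replicate" involves, the following standard-library sketch converts values that JSON cannot represent directly and records their original types so they can be restored later. The helpers to_jsonable and from_jsonable are hypothetical and are not the spechomo functions; they only show the general pattern of a values dictionary accompanied by a types dictionary.

```python
import json


def to_jsonable(value):
    """Return (jsonable_value, type_tag) for a small set of Python types."""
    if isinstance(value, tuple):
        return list(value), "tuple"
    if isinstance(value, set):
        return sorted(value), "set"
    if isinstance(value, complex):
        return [value.real, value.imag], "complex"
    return value, type(value).__name__


def from_jsonable(value, type_tag):
    """Invert to_jsonable() using the recorded type tag."""
    if type_tag == "tuple":
        return tuple(value)
    if type_tag == "set":
        return set(value)
    if type_tag == "complex":
        return complex(*value)
    return value


params = {"n_clusters": 50, "bands": (1, 2, 3), "labels": {"b", "a"}}
jsonable, types = {}, {}
for key, val in params.items():
    jsonable[key], types[key] = to_jsonable(val)

# Serialize values and type tags together, then restore the original types.
json_str = json.dumps({"values": jsonable, "types": types})
loaded = json.loads(json_str)
restored = {k: from_jsonable(v, loaded["types"][k])
            for k, v in loaded["values"].items()}
```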

spechomo.classifier.classifier_from_json_str(json_str)[source]

Create a spectral harmonization classifier from a JSON string (JSON de-serialization).

Parameters

json_str – the JSON string to be used for de-serialization

Returns

spechomo.classifier.classifier_to_jsonable_dict(clf, skipkeys=None, include_typesdict=False)[source]
spechomo.classifier.get_jsonable_value(in_value, return_typesdict=False)[source]

spechomo.classifier_creation module

class spechomo.classifier_creation.ClusterClassifier_Generator(list_refcubes, logger=None)[source]

Bases: object

Class for creating collections of machine learning classifiers that can be used for spectral homogenization.

Get an instance of ClusterClassifier_Generator.

Parameters
  • list_refcubes (List[Union[str, RefCube]]) – list of RefCube instances or paths for which the classifiers are to be created.

  • logger (Optional[Logger]) – instance of logging.Logger()

create_classifiers(outDir, method='LR', n_clusters=50, sam_classassignment=False, CPUs=None, max_distance='80%', max_angle=5, **kwargs)[source]

Create cluster classifiers for all combinations of the reference cubes given in __init__().

Parameters
  • outDir (str) – output directory for the created cluster classifier collections

  • method (str) – type of machine learning classifiers to be included in classifier collections: ‘LR’ (Linear Regression), ‘RR’ (Ridge Regression), ‘QR’ (Quadratic Regression), ‘RFR’ (Random Forest Regression; 50 trees with maximum depth of 3 by default)

  • n_clusters (int) – number of clusters to be used for KMeans clustering

  • sam_classassignment (bool) – False: use the minimal Euclidean distance to assign classes to cluster centers; True: use the minimal spectral angle to assign classes to cluster centers

  • CPUs (Optional[int]) – number of CPUs to be used for KMeans clustering

  • max_distance (Union[int, str]) – maximum spectral distance allowed during filtering of training spectra - if given as string, e.g., ‘80%’ excludes the worst 20 % of the input spectra

  • max_angle (Union[int, str]) – maximum spectral angle allowed during filtering of training spectra - if given as string, e.g., ‘80%’ excludes the worst 20 % of the input spectra

  • kwargs (dict) – keyword arguments to be passed to machine learner
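The percentile form of max_distance and max_angle can be pictured as follows. This standard-library sketch assumes that a string such as ‘80%’ is converted into the corresponding percentile of the observed values and that spectra above this threshold are dropped; resolve_threshold and keep_mask are hypothetical names, and the interpolation rule is an assumption.

```python
def resolve_threshold(values, limit):
    """Turn a limit such as 5 or '80%' into an absolute threshold."""
    if isinstance(limit, str):
        if not limit.endswith("%"):
            raise ValueError("string limits must look like '80%'")
        fraction = float(limit[:-1]) / 100.0
        ordered = sorted(values)
        # Linear interpolation between the two closest ranks.
        pos = fraction * (len(ordered) - 1)
        lower = int(pos)
        upper = min(lower + 1, len(ordered) - 1)
        return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)
    return float(limit)


def keep_mask(values, limit):
    """Mark the values that do not exceed the resolved threshold."""
    threshold = resolve_threshold(values, limit)
    return [v <= threshold for v in values]


distances = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
mask_pct = keep_mask(distances, "80%")   # drops the two largest distances
mask_abs = keep_mask(distances, 5)       # keeps distances up to 5
```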

Return type

None

static train_machine_learner(train_X, train_Y, test_X, test_Y, method, **kwargs)[source]

Use the given train and test data to train a machine learner and append some accuracy statistics.

Parameters
  • train_X (np.ndarray) – reference training data

  • train_Y (np.ndarray) – target training data

  • test_X (np.ndarray) – reference test data

  • test_Y (np.ndarray) – target test data

  • method (str) – type of machine learning classifiers to be included in classifier collections: ‘LR’ (Linear Regression), ‘RR’ (Ridge Regression), ‘QR’ (Quadratic Regression), ‘RFR’ (Random Forest Regression; 50 trees)

  • kwargs (dict) – keyword arguments to be passed to the __init__() function of machine learners

Return type

Union[LinearRegression, Ridge, Pipeline, RandomForestRegressor]

class spechomo.classifier_creation.ReferenceCube_Generator(filelist_refs, tgt_sat_sen_list=None, dir_refcubes='', n_clusters=10, tgt_n_samples=1000, v=False, logger=None, CPUs=None, dir_clf_dump='')[source]

Bases: object

Class for creating reference cubes that are later used as training data for SpecHomo_Classifier.

Initialize ReferenceCube_Generator.

Parameters
  • filelist_refs (List[str]) – list of (hyperspectral) reference images, representing BOA reflectance, scaled between 0 and 10000

  • tgt_sat_sen_list (Optional[List[Tuple[str, str]]]) – list of satellite/sensor tuples containing those sensors for which the reference cube is to be computed, e.g., [(‘Landsat-8’, ‘OLI_TIRS’), (‘Landsat-5’, ‘TM’)]

  • dir_refcubes (str) – output directory for the generated reference cube

  • n_clusters (int) – number of clusters to be used for clustering the input images (KMeans)

  • tgt_n_samples (int) – number of spectra to be collected from each input image

  • v (bool) – verbose mode

  • logger (Optional[Logger]) – instance of logging.Logger()

  • CPUs (Optional[int]) – number of CPUs to use for computation

  • dir_clf_dump (str) – directory where to store the serialized KMeans classifier

cluster_image_and_get_uniform_spectra(im, downsamp_sat=None, downsamp_sen=None, basename_clf_dump='', try_read_dumped_clf=True, sam_classassignment=False, max_distance='80%', max_angle=6, nmin_unique_spectra=50, progress=False)[source]

Compute KMeans clusters for the given image and return an array of uniform random samples.

Parameters
  • im (Union[str, GeoArray, ndarray]) – image to be clustered

  • downsamp_sat (Optional[str]) – satellite code used for intermediate image dimensionality reduction (the input image is spectrally resampled to this satellite before it is clustered); requires downsamp_sen. If None, no intermediate downsampling is performed.

  • downsamp_sen (Optional[str]) – sensor code used for intermediate image dimensionality reduction (requires downsamp_sat)

  • basename_clf_dump (str) – basename of serialized KMeans classifier

  • try_read_dumped_clf (bool) – try to read a previously serialized KMeans classifier from disk (massively speeds up the RefCube generation)

  • sam_classassignment (bool) – False: use the minimal Euclidean distance to assign classes to cluster centers; True: use the minimal spectral angle to assign classes to cluster centers

  • max_distance (int) – spectra with a larger spectral distance than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral distance is computed as the 20 % percentile within each cluster

  • max_angle (Union[int, float, str]) – spectra with a larger spectral angle than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral angle is computed as the 20 % percentile within each cluster

  • nmin_unique_spectra (Union[int, float, str]) – if a cluster has fewer unique spectra than the given number, it is not included in the reference cube (default: 50)

  • progress (bool) – whether to show progress bars or not

Return type

ndarray

Returns

2D array (rows: tgt_n_samples, columns: spectral information / bands)

generate_reference_cubes(fmt_out='ENVI', try_read_dumped_clf=True, sam_classassignment=False, max_distance='80%', max_angle=6, nmin_unique_spectra=50, alg_nodata='radical', progress=True)[source]

Generate reference spectra from all hyperspectral input images.

Workflow:

  1. Clustering/classification of hyperspectral images and selection of a given number of random signatures

     a. Spectral downsampling to lower spectral resolution (optional; speedup)
     b. KMeans clustering
     c. Selection of the same number of signatures from each cluster to avoid unequal amounts of training data

  2. Spectral resampling of the selected hyperspectral signatures (for each input image)

  3. Add resampled spectra to reference cubes for each target sensor and write cubes to disk

Parameters
  • fmt_out (str) – output format (GDAL driver code)

  • try_read_dumped_clf (bool) – try to read a previously serialized KMeans classifier from disk (massively speeds up the RefCube generation)

  • sam_classassignment (bool) – False: use the minimal Euclidean distance to assign classes to cluster centers; True: use the minimal spectral angle to assign classes to cluster centers

  • max_distance (int) – spectra with a larger spectral distance than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral distance is computed as the 20 % percentile within each cluster

  • max_angle (Union[int, float, str]) – spectra with a larger spectral angle than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral angle is computed as the 20 % percentile within each cluster

  • nmin_unique_spectra (Union[int, float, str]) – if a cluster has fewer unique spectra than the given number, it is not included in the reference cube (default: 50)

  • alg_nodata (str) – algorithm how to deal with pixels where the spectral bands of the source image contain nodata within the spectral response of a target band

    ‘radical’: set output band to nodata

    ‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)

  • progress (bool) – show progress bar (default: True)

Return type

ReferenceCube_Generator.refcubes

Returns

np.array: [tgt_n_samples x images x spectral bands of the target sensor]

property refcubes

Return a dict holding instances of RefCube for each target satellite / sensor of self.tgt_sat_sen_list.

Return type

Dict[Tuple[str, str]: RefCube]

resample_image_spectrally(src_im, tgt_rsr, src_nodata=None, alg_nodata='radical', progress=False)[source]

Perform spectral resampling of the given image to match the given spectral response functions.

Parameters
  • src_im (Union[str, GeoArray]) – source image to be resampled

  • tgt_rsr (RelativeSpectralResponse) – target relative spectral response functions to be used for spectral resampling

  • src_nodata (Union[int, float, None]) – source image nodata value

  • alg_nodata (str) – algorithm how to deal with pixels where the spectral bands of the source image contain nodata within the spectral response of a target band

    ‘radical’: set output band to nodata

    ‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)

  • progress (bool) – show progress bar (default: False)

Return type

Optional[GeoArray]

Returns

resample_spectra(spectra, src_cwl, tgt_rsr, nodataVal, alg_nodata='radical')[source]

Perform spectral resampling of the given spectra to match the given spectral response functions.

Parameters
  • spectra (Union[GeoArray, ndarray]) – 2D array (rows: spectral samples; columns: spectral information / bands)

  • src_cwl (Union[list, array]) – central wavelength positions of input spectra

  • tgt_rsr (RelativeSpectralResponse) – target relative spectral response functions to be used for spectral resampling

  • nodataVal (int) – nodata value of the given spectra to be ignored during resampling

  • alg_nodata (str) – algorithm how to deal with pixels where the spectral bands of the source image contain nodata within the spectral response of a target band

    ‘radical’: set output band to nodata

    ‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)

Return type

ndarray

Returns

spechomo.classifier_creation.get_filename_classifier_collection(method, src_satellite, src_sensor, n_clusters=1, **cls_kwinit)[source]
spechomo.classifier_creation.get_machine_learner(method='LR', **init_params)[source]

Get an instance of a machine learner.

Parameters
  • method (str) – ‘LR’ (Linear Regression), ‘RR’ (Ridge Regression), ‘QR’ (Quadratic Regression), ‘RFR’ (Random Forest Regression; 50 trees)

  • init_params (dict) – parameters to be passed to __init__() function of the returned machine learner model.

Return type

Union[LinearRegression, Ridge, Pipeline]

spechomo.clustering module

class spechomo.clustering.KMeansRSImage(im, n_clusters, sam_classassignment=False, CPUs=1, v=False)[source]

Bases: object

Class for clustering a given input image by using K-Means algorithm.

NOTE: Based on the nodata value of the input GeoArray, pixels that have nodata values in some bands are ignored when computing the cluster coefficients. Otherwise, nodata values would affect the clustering result.

apply_clusters(image)[source]
property clustermap
property clusters
Return type

KMeans

compute_clusters(nmax_spectra=100000)[source]

Compute the cluster means and labels.

Parameters

nmax_spectra – maximum number of spectra to be included (pseudo-randomly selected, reproducible)

Returns

static compute_euclidian_distance_2D(spectra, endmembers)[source]
static compute_euclidian_distance_for_labelled_spectra(spectra, labels, endmembers)[source]
compute_spectral_angles()[source]
compute_spectral_distances()[source]
dump(path_out)[source]
classmethod from_disk(path_clf, im)[source]

Get an instance of KMeansRSImage from a previously saved classifier.

Parameters
  • path_clf (str) – path of the serialized classifier (dill file)

  • im (GeoArray) – path of the image cube belonging to that classifier

Return type

KMeansRSImage

Returns

KMeansRSImage

get_purest_spectra_from_each_cluster(samplesize=50)[source]

Return a given number of spectra directly surrounding the center of each cluster.

E.g., 50 spectra belonging to cluster 1, 50 spectra belonging to cluster 2 and so on.

Parameters

samplesize (int) – number of spectra to be selected from each cluster

Return type

dict

Returns

get_random_spectra_from_each_cluster(samplesize=50, max_distance=None, max_angle=None, nmin_unique_spectra=50)[source]

Return a given number of spectra randomly selected within each cluster.

E.g., 50 spectra belonging to cluster 1, 50 spectra belonging to cluster 2 and so on.
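Drawing the same number of samples from every cluster can be sketched with the standard library alone. The helper sample_per_cluster is hypothetical; it works on a flat list of labels, counts cluster members rather than unique spectra, and returns None for clusters that are too small, which is only one possible reading of "return missing values".

```python
import random


def sample_per_cluster(labels, samplesize, nmin_unique=1, seed=0):
    """Draw the same number of sample indices from every cluster label."""
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    by_label = {}
    for idx, label in enumerate(labels):
        by_label.setdefault(label, []).append(idx)

    samples = {}
    for label, indices in sorted(by_label.items()):
        if len(indices) < nmin_unique:
            samples[label] = None  # too few members: skip this cluster
        elif len(indices) >= samplesize:
            samples[label] = rng.sample(indices, samplesize)
        else:
            # Fewer members than requested: sample with replacement.
            samples[label] = rng.choices(indices, k=samplesize)
    return samples


labels = [0, 0, 0, 0, 1, 1, 2]
picked = sample_per_cluster(labels, samplesize=3, nmin_unique=2)
```

Here cluster 0 yields three distinct indices, cluster 1 is sampled with replacement, and cluster 2 is skipped because it has only one member.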

Parameters
  • samplesize (int) – number of spectra to be randomly selected from each cluster

  • max_distance (Union[int, float, str, None]) – spectra with a larger spectral distance than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral distance is computed as the 20 % percentile within each cluster

  • max_angle (Union[int, float, str, None]) – spectra with a larger spectral angle than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral angle is computed as the 20 % percentile within each cluster

  • nmin_unique_spectra (int) – if a cluster has fewer unique spectra than the given number, its spectra are not used (missing values are returned)

Return type

dict

Returns

property goodSpecMask
property labels

Get labels for all clustered spectra (excluding spectra that contain nodata values).

property labels_with_nodata

Get the labels for all pixels (including those containing nodata values).

property n_spectra

Get number of spectra used for clustering (excluding spectra containing nodata values).

plot_cluster_centers(figsize=(15, 5))[source]

Show a plot of the cluster center signatures.

Parameters

figsize (tuple) – figure size (inches)

Return type

None

plot_cluster_histogram(figsize=(15, 5))[source]

Show a histogram indicating the proportion of each cluster label in percent.

Parameters

figsize (tuple) – figure size (inches)

Return type

None

plot_clustermap(figsize=None)[source]

Show the clustered image.

Parameters

figsize (Optional[tuple]) – figure size (inches)

Return type

None

save_clustermap(path_out, **kw_save)[source]
property spectra

Get spectra used for clustering (excluding spectra containing nodata values that would affect clustering).

property spectral_angles

Get spectral angles in degrees for all pixels that don’t contain nodata values.

property spectral_angles_with_nodata
property spectral_distances

Get spectral distances for all pixels that don’t contain nodata values.

property spectral_distances_with_nodata

spechomo.exceptions module

exception spechomo.exceptions.ClassifierNotAvailableError(spechomo_method, src_sat, src_sen, src_LBA, tgt_sat, tgt_sen, tgt_LBA, n_clusters)[source]

Bases: RuntimeError

spechomo.logging module

SpecHomo logging module containing logging related classes and functions.

class spechomo.logging.LessThanFilter(exclusive_maximum, name='')[source]

Bases: logging.Filter

Filter class to filter log messages by a maximum log level.

Based on http://stackoverflow.com/questions/2302315/how-can-info-and-debug-logging-message-be-sent-to-stdout-and-higher-level-messag

Get an instance of LessThanFilter.

Parameters
  • exclusive_maximum – maximum log level, e.g., logging.WARNING

  • name

filter(record)[source]

Filter function.

NOTE: Returns True if logging level of the given record is below the maximum log level.

Parameters

record

Returns

bool
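A typical use of such a maximum-level filter is to route low-level messages to one stream and warnings or errors to another. The sketch below uses only the standard logging module; MaxLevelFilter is a hypothetical stand-in that mirrors the behaviour described above, and the logger name and the in-memory streams are chosen purely for illustration.

```python
import io
import logging


class MaxLevelFilter(logging.Filter):
    """Let records pass only if their level is below exclusive_maximum."""

    def __init__(self, exclusive_maximum, name=""):
        super().__init__(name)
        self.max_level = exclusive_maximum

    def filter(self, record):
        return record.levelno < self.max_level


low_stream, high_stream = io.StringIO(), io.StringIO()

# Handler for DEBUG and INFO only (everything below WARNING).
low_handler = logging.StreamHandler(low_stream)
low_handler.setLevel(logging.DEBUG)
low_handler.addFilter(MaxLevelFilter(logging.WARNING))

# Handler for WARNING and above.
high_handler = logging.StreamHandler(high_stream)
high_handler.setLevel(logging.WARNING)

demo_logger = logging.getLogger("spechomo_filter_demo")
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False
demo_logger.addHandler(low_handler)
demo_logger.addHandler(high_handler)

demo_logger.info("goes to the low stream")
demo_logger.error("goes to the high stream")
```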

class spechomo.logging.SpecHomo_Logger(name_logfile, fmt_suffix=None, path_logfile=None, log_level='INFO', append=True)[source]

Bases: logging.Logger

Class for the SpecHomo logger.

Return a logging.Logger instance pointing to the given logfile path.

Parameters
  • name_logfile (str) –

  • fmt_suffix (Optional[any]) – if given, it will be included into log formatter

  • path_logfile (Optional[str]) – if no path is given, only a StreamHandler is created

  • log_level (any) – the logging level to be used (choices: ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’; default: ‘INFO’)

  • append (bool) – whether to append the log messages to an existing logfile (True) or to create a new logfile (False); default: True

property captured_stream: str

Return the already captured logging stream.

NOTE:
  • set self.captured_stream:

    self.captured_stream = ‘any string’

Return type

str

close()[source]

Close all logging handlers.

view_logfile()[source]

View the log file written to disk.

spechomo.logging.close_logger(logger)[source]

Close the handlers of the given logging.Logger instance.

Parameters

logger – logging.Logger instance or subclass instance

spechomo.logging.shutdown_loggers()[source]

Shutdown any currently active loggers.

spechomo.options module

spechomo.prediction module

Main module.

class spechomo.prediction.RSImage_ClusterPredictor(method='LR', n_clusters=50, classif_alg='MinDist', classifier_rootDir='', CPUs=1, logger=None, progress=True, **kw_clf_init)[source]

Bases: object

Predictor class applying the predict() function of a machine learning classifier described by the given args.

Get an instance of RSImage_ClusterPredictor.

Parameters
  • method (str) – machine learning approach to be used for spectral bands prediction: ‘LR’ (Linear Regression), ‘RR’ (Ridge Regression), ‘QR’ (Quadratic Regression), ‘RFR’ (Random Forest Regression; 50 trees; does not allow spectral sub-clustering)

  • n_clusters (int) – Number of spectral clusters to be used during LR/ RR/ QR homogenization. E.g., 50 means that the image to be converted to the spectral target sensor is clustered into 50 spectral clusters and one separate machine learner per cluster is applied to the input data to predict the homogenized image. If ‘n_clusters’ is set to 1, the source image is not clustered and only one machine learning classifier is used for prediction.

  • classif_alg (str) – algorithm to be used for image classification (to define which cluster each pixel belongs to): ‘MinDist’ (Minimum Distance / Nearest Centroid); ‘kNN’ (k-nearest-neighbour); ‘kNN_MinDist’ (k-nearest-neighbour Minimum Distance / Nearest Centroid); ‘SAM’ (spectral angle mapping); ‘kNN_SAM’ (k-nearest-neighbour spectral angle mapping); ‘SID’ (spectral information divergence); ‘FEDSA’ (fused Euclidean distance / spectral angle); ‘kNN_FEDSA’ (k-nearest-neighbour fused Euclidean distance / spectral angle)

  • classifier_rootDir (str) – root directory where machine learning classifiers are stored.

  • CPUs (Optional[int]) – number of CPUs to use (default: 1)

  • progress (bool) – whether to show progress bars

  • logger (Optional[Logger]) – instance of logging.Logger()

  • kw_clf_init – keyword arguments to be passed to classifier init functions if possible, e.g., ‘n_neighbours’ sets the number of neighbours to be considered in kNN classification algorithms (set by ‘classif_alg’)

compute_prediction_errors(im_predicted, cluster_classifier, nodataVal=None, cmap_nodataVal=None)[source]

Compute errors that quantify prediction inaccuracy per band and per pixel.

Parameters
  • im_predicted (Union[ndarray, GeoArray]) – 3D array representing the predicted image

  • cluster_classifier (Cluster_Learner) – instance of Cluster_Learner

  • nodataVal (Optional[float]) – no data value of the input image (auto-computed if not given or contained in im_predicted GeoArray) NOTE: The value is also used as output nodata value for the errors array.

  • cmap_nodataVal (Optional[float]) – no data value for the classification map in case more than one sub-class is used for prediction

Return type

ndarray

Returns

3D array (int16) representing prediction errors per band and pixel

get_classifier(src_satellite, src_sensor, src_LBA, tgt_satellite, tgt_sensor, tgt_LBA)[source]

Select the correct machine learning classifier out of previously saved classifier collections.

Describe the classifier specifications with the given arguments.

Parameters
  • src_satellite (str) – source satellite, e.g., ‘Landsat-8’

  • src_sensor (str) – source sensor, e.g., ‘OLI_TIRS’

  • src_LBA (list) – source LayerBandsAssignment

  • tgt_satellite (str) – target satellite, e.g., ‘Landsat-8’

  • tgt_sensor (str) – target sensor, e.g., ‘OLI_TIRS’

  • tgt_LBA (list) – target LayerBandsAssignment

Return type

Cluster_Learner

Returns

classifier instance loaded from disk

predict(image, classifier, in_nodataVal=None, out_nodataVal=None, cmap_nodataVal=-9999, global_clf_threshold=None, unclassified_pixVal=-1)[source]

Apply the prediction function of the given classifier to the given remote sensing image.

Parameters
  • image (Union[ndarray, GeoArray]) – 3D array representing the input image

  • classifier (Cluster_Learner) – the classifier instance

  • in_nodataVal (Optional[float]) – no data value of the input image (auto-computed if not given or contained in image GeoArray)

  • out_nodataVal (Optional[float]) – no data value written into the predicted image (copied from the input image if not given)

  • cmap_nodataVal (float) – no data value for the classification map in case more than one sub-class is used for prediction (default: -9999)

  • global_clf_threshold (Union[int, float, str, None]) – If given, all pixels where the computed similarity metric (set by ‘classif_alg’) exceeds the given threshold are predicted using the global classifier (based on a single transformation per band).

    - not usable for ‘kNN’
    - may be given as float, integer or string to label a certain distance percentile
    - if given as string, it must match the format, e.g., ‘10%’ for labelling the worst 10 % of the distances as unclassified

  • unclassified_pixVal (int) – pixel value to be used in the classification map for unclassified pixels (default: -1)

Return type

GeoArray

Returns

3D array representing the predicted spectral image cube

class spechomo.prediction.SpectralHomogenizer(classifier_rootDir='', logger=None, CPUs=None, progress=True)[source]

Bases: object

Class for applying spectral homogenization by applying an interpolation or machine learning approach.

Get instance of SpectralHomogenizer.

Parameters
  • classifier_rootDir – root directory where machine learning classifiers are stored.

  • logger – instance of logging.Logger

  • progress – whether to show progress bars

interpolate_cube(arrcube, source_CWLs, target_CWLs, kind='linear')[source]

Spectrally interpolate the spectral bands of a remote sensing image to new band positions.
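For the linear case, the interpolation along the spectral axis can be sketched with numpy.interp. This is a stand-in for illustration only: the method itself is documented as passing kind to scipy.interpolate.interp1d, the helper interpolate_bands_linear is hypothetical, and the wavelengths below are made-up values.

```python
import numpy as np


def interpolate_bands_linear(cube, source_cwls, target_cwls):
    """Linearly interpolate each pixel spectrum to new band positions."""
    cube = np.asarray(cube, dtype=float)
    rows, cols, n_bands = cube.shape
    flat = cube.reshape(-1, n_bands)
    out = np.empty((flat.shape[0], len(target_cwls)), dtype=float)
    for i, spectrum in enumerate(flat):
        # np.interp expects increasing x positions (source wavelengths).
        out[i] = np.interp(target_cwls, source_cwls, spectrum)
    return out.reshape(rows, cols, len(target_cwls))


cube = np.array([[[100.0, 200.0, 400.0]]])   # 1 x 1 pixel, 3 bands
source_cwls = [450.0, 550.0, 650.0]           # nm, hypothetical
target_cwls = [500.0, 600.0]
interpolated = interpolate_bands_linear(cube, source_cwls, target_cwls)
```

Note that numpy.interp clamps target positions outside the source wavelength range to the edge values, so extrapolation behaviour may differ from other interpolators.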

Parameters
  • arrcube (Union[ndarray, GeoArray]) – array to be spectrally interpolated

  • source_CWLs (list) – list of source central wavelength positions

  • target_CWLs (list) – list of target central wavelength positions

  • kind (str) – interpolation kind to be passed to scipy.interpolate.interp1d (default: ‘linear’)

Return type

GeoArray

Returns

predict_by_machine_learner(arrcube, method, src_satellite, src_sensor, src_LBA, tgt_satellite, tgt_sensor, tgt_LBA, n_clusters=50, classif_alg='MinDist', kNN_n_neighbors=10, global_clf_threshold='10%', src_nodataVal=None, out_nodataVal=None, compute_errors=False, bandwise_errors=True, fallback_argskwargs=None)[source]

Predict spectral bands of target sensor by applying a machine learning approach.

NOTE: You may use the function spechomo.utils.list_available_transformations() to get a list of available transformations. You may also copy the input parameters for this method from the output there.

Parameters
  • arrcube (Union[ndarray, GeoArray]) – input image array for target sensor spectral band prediction (rows x cols x bands)

  • method (str) – machine learning approach to be used for spectral bands prediction: ‘LR’ (Linear Regression), ‘RR’ (Ridge Regression), ‘QR’ (Quadratic Regression), ‘RFR’ (Random Forest Regression; 50 trees; does not allow spectral sub-clustering)

  • src_satellite (str) – source satellite, e.g., ‘Landsat-8’

  • src_sensor (str) – source sensor, e.g., ‘OLI_TIRS’

  • src_LBA (list) – source LayerBandsAssignment # TODO document this

  • tgt_satellite (str) – target satellite, e.g., ‘Landsat-8’

  • tgt_sensor (str) – target sensor, e.g., ‘OLI_TIRS’

  • tgt_LBA (list) – target LayerBandsAssignment # TODO document this

  • n_clusters (int) – Number of spectral clusters to be used during LR/ RR/ QR homogenization. E.g., 50 means that the image to be converted to the spectral target sensor is clustered into 50 spectral clusters and one separate machine learner per cluster is applied to the input data to predict the homogenized image. If ‘n_clusters’ is set to 1, the source image is not clustered and only one machine learning classifier is used for prediction.

  • classif_alg (str) – Multispectral classification algorithm to be used to determine the spectral cluster each pixel belongs to: ‘MinDist’ (Minimum Distance / Nearest Centroid); ‘kNN’ (k-nearest-neighbour); ‘kNN_MinDist’ (k-nearest-neighbour Minimum Distance / Nearest Centroid); ‘SAM’ (spectral angle mapping); ‘kNN_SAM’ (k-nearest-neighbour spectral angle mapping); ‘SID’ (spectral information divergence); ‘FEDSA’ (fused Euclidean distance / spectral angle); ‘kNN_FEDSA’ (k-nearest-neighbour fused Euclidean distance / spectral angle)

  • kNN_n_neighbors (int) – The number of neighbors to be considered in case ‘classif_alg’ is set to ‘kNN’. Otherwise, this parameter is ignored.

  • global_clf_threshold (Union[str, int, float]) – If given, all pixels where the computed similarity metric (set by ‘classif_alg’) exceeds the given threshold are predicted using the global classifier (based on a single transformation per band).

    - only usable for ‘MinDist’, ‘SAM’ and ‘SID’ as well as their kNN variants
    - may be given as float, integer or string to label a certain distance percentile
    - if given as string, it must match the format, e.g., ‘10%’ for labelling the worst 10 % of the distances as unclassified

  • src_nodataVal (Optional[int]) – no data value of source image (arrcube); if no nodata value is set, an attempt is made to auto-compute it from arrcube

  • out_nodataVal (Optional[int]) – no data value of predicted image

  • compute_errors (bool) – whether to compute pixel- / bandwise model errors for estimated pixel values (default: false)

  • bandwise_errors – whether to compute error information for each band separately (True, default) or to average errors over bands using the median (False) (ignored in case of fallback)

  • fallback_argskwargs (Optional[dict]) – arguments and keyword arguments to be passed to the fallback algorithm SpectralHomogenizer.interpolate_cube() in case harmonization fails

Returns

predicted array (rows x columns x bands)

Return type

Tuple[np.ndarray, Union[np.ndarray, None]]
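
The global_clf_threshold semantics described above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper names resolve_threshold and mask_for_global_classifier are hypothetical, and the sketch assumes that larger metric values mean a worse match, so that ‘10%’ corresponds to the 90th percentile of the distances.

```python
import numpy as np


def resolve_threshold(distances, threshold):
    """Turn a float / int / 'NN%' threshold into an absolute distance value.

    Hypothetical helper for illustration only.
    """
    if isinstance(threshold, str):
        if not threshold.endswith('%'):
            raise ValueError("string thresholds must look like '10%'")
        worst_percent = float(threshold[:-1])
        # The worst N % of the distances lie above the (100 - N)th percentile.
        return float(np.percentile(distances, 100 - worst_percent))
    return float(threshold)


def mask_for_global_classifier(distances, threshold):
    """Boolean mask of pixels whose distance exceeds the resolved threshold."""
    return distances > resolve_threshold(distances, threshold)


# Toy distance image with the values 1..100.
distances = np.arange(1, 101, dtype=float).reshape(10, 10)
mask_percent = mask_for_global_classifier(distances, '10%')
mask_absolute = mask_for_global_classifier(distances, 50)
```

In this sketch, pixels where the mask is True would be handed to the global classifier, while all others keep their cluster-specific prediction.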

spechomo.resampling module

class spechomo.resampling.SpectralResampler(wvl_src, rsr_tgt, logger=None)[source]

Bases: object

Class for spectral resampling of a single spectral signature (1D-array) or an image (3D-array).

Get an instance of the SpectralResampler class.

Parameters
  • wvl_src (np.ndarray) – center wavelength positions of the source spectrum

  • rsr_tgt (RSR) – relative spectral response of the target instrument as an instance of pyrsr.RelativeSpectralResponse.

resample_image(image_cube, tiledims=(20, 20), nodataVal=None, alg_nodata='radical', CPUs=None)[source]

Resample the given spectral image cube according to the spectral response functions of the target instrument.

Parameters
  • image_cube (Union[GeoArray, ndarray]) – image (3D array) containing the spectral information in the third dimension

  • tiledims (tuple) – dimension of tiles to be used during computation (rows, columns)

  • nodataVal (Union[int, float, None]) – nodata value of the input image

  • alg_nodata (str) –

    algorithm for handling pixels where the spectral bands of the source image contain nodata within the spectral response of a target band
      ‘radical’: set output band to nodata
      ‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)

  • CPUs (Optional[int]) – CPUs to use for processing

Return type

ndarray

Returns

resampled spectral image cube
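
The difference between the two alg_nodata modes can be sketched for a single spectrum and a single target band. This is an illustration under assumptions, not the code used by SpectralResampler: the function resample_band is hypothetical, and the target band is modelled as a plain weighted mean over source bands with nonzero response.

```python
import numpy as np


def resample_band(spectrum, rsr_weights, nodata, alg_nodata='radical'):
    """Weighted mean of one spectrum under one target band response.

    Hypothetical helper for illustration only.
    """
    if alg_nodata not in ('radical', 'conservative'):
        raise ValueError(f"unknown alg_nodata: {alg_nodata!r}")

    inside = rsr_weights > 0          # source bands covered by the target band
    is_nodata = spectrum == nodata

    # 'radical': a single gap within the response invalidates the output band.
    if alg_nodata == 'radical' and np.any(inside & is_nodata):
        return nodata

    # 'conservative': average over whatever valid bands remain.
    valid = inside & ~is_nodata
    if not valid.any():
        return nodata
    return float(np.average(spectrum[valid], weights=rsr_weights[valid]))


nodata = -9999.0
weights = np.array([0.5, 1.0, 0.5, 0.0])
spectrum_gap = np.array([100.0, nodata, 300.0, 400.0])

out_radical = resample_band(spectrum_gap, weights, nodata, 'radical')
out_conservative = resample_band(spectrum_gap, weights, nodata, 'conservative')
```

The conservative result is computed without the band that carries the highest weight, which shows why this mode can alter the output spectral information.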

resample_signature(spectrum, scale_factor=10000, nodataVal=None, alg_nodata='radical', v=False)[source]

Resample the given spectrum according to the spectral response functions of the target instrument.

Parameters
  • spectrum (ndarray) – spectral signature data

  • scale_factor (int) – the scale factor to apply to the given spectrum when it is plotted (default: 10000)

  • nodataVal (Union[int, float, None]) – no data value to be respected during resampling

  • alg_nodata (str) –

    algorithm for handling pixels where the spectral bands of the source image contain nodata within the spectral response of a target band
      ‘radical’: set output band to nodata
      ‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)

  • v (bool) – enable verbose mode (shows a plot of the resampled spectrum) (default: False)

Return type

ndarray

Returns

resampled spectral signature

resample_spectra(spectra, chunksize=200, nodataVal=None, alg_nodata='radical', CPUs=None)[source]

Resample the given spectral signatures according to the spectral response functions of the target instrument.

Parameters
  • spectra (Union[GeoArray, ndarray]) – spectral signatures, provided as 2D array (rows: spectral samples, columns: spectral information / bands)

  • chunksize (int) – defines how many spectral signatures are resampled per CPU

  • nodataVal (Union[int, float, None]) – no data value to be respected during resampling

  • alg_nodata (str) –

    algorithm for handling pixels where the spectral bands of the source image contain nodata within the spectral response of a target band
      ‘radical’: set output band to nodata
      ‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)

  • CPUs (Optional[int]) – CPUs to use for processing

Return type

ndarray

property rsr_1nm
property wvl_1nm

spechomo.training_data module

class spechomo.training_data.RefCube(filepath='', satellite='', sensor='', LayerBandsAssignment=None)[source]

Bases: object

Data model class for reference cubes holding the training data for later fitted machine learning classifiers.

Get instance of RefCube.

Parameters
  • filepath (str) – file path for importing an existing reference cube from disk

  • satellite (str) – the satellite for which the reference cube holds its spectral data

  • sensor (str) – the sensor for which the reference cube holds its spectral data

  • LayerBandsAssignment (Optional[list]) – the LayerBandsAssignment for which the reference cube holds its spectral data

add_refcube_array(refcube_array, src_imnames, LayerBandsAssignment)[source]

Add the given array to the RefCube instance.

Parameters
  • refcube_array (Union[str, ndarray]) – 3D array or file path of the reference cube to be added (spectral samples / signatures x training images x spectral bands)

  • src_imnames (list) – list of training image file base names from which the given cube received data

  • LayerBandsAssignment (list) – LayerBandsAssignment of the spectral bands of the given 3D array

Return type

None

add_spectra(spectra, src_imname, LayerBandsAssignment)[source]

Add a set of spectral signatures to the reference cube.

Parameters
  • spectra (ndarray) – 2D numpy array with rows: spectral samples / columns: spectral information (bands)

  • src_imname (str) – image basename of the source hyperspectral image

  • LayerBandsAssignment (list) – LayerBandsAssignment for the spectral dimension of the passed spectra, e.g., [‘1’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6L’, ‘6H’, ‘7’, ‘8’]

Return type

None

property col_imName_dict

Return an ordered dict containing the file base names of the original training images for each column.

Return type

OrderedDict

get_band_combination(tgt_LBA)[source]

Get an array according to the bands order given by a target LayerBandsAssignment.

Parameters

tgt_LBA (List[str]) – target LayerBandsAssignment

Return type

GeoArray

get_spectra_dataframe(tgt_LBA)[source]

Return a pandas.DataFrame [sample x band] according to the given LayerBandsAssignment.

Parameters

tgt_LBA (List[str]) – target LayerBandsAssignment

Return type

DataFrame

property metadata

Return an ordered dictionary holding the metadata of the reference cube.

property n_clusters

Return the number of spectral clusters used for clustering source images for the reference cube.

property n_images

Return the number of training images from which the reference cube contains spectral samples.

property n_signatures

Return the number of spectral signatures per training image included in the reference cube.

property n_signatures_per_cluster
plot_sample_spectra(image_basename, cluster_label='all', include_mean_spectrum=True, include_median_spectrum=True, ncols=5, **kw_fig)[source]
Return type

plt.figure

read_data_from_disk(filepath)[source]
rearrange_layers(tgt_LBA)[source]

Rearrange the spectral bands of the reference cube according to the given LayerBandsAssignment.

Parameters

tgt_LBA (List[str]) – target LayerBandsAssignment

Return type

None
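
Rearranging bands according to a LayerBandsAssignment amounts to an index lookup along the band axis. The sketch below illustrates that idea with NumPy. The function reorder_bands is hypothetical and is not part of RefCube. It also accepts a subset of the source bands, which is an assumption made for the example.

```python
import numpy as np


def reorder_bands(cube, src_lba, tgt_lba):
    """Return the cube with its last axis reordered to match tgt_lba.

    Hypothetical helper for illustration only.
    """
    missing = [band for band in tgt_lba if band not in src_lba]
    if missing:
        raise ValueError(f"bands not present in the source cube: {missing}")
    # Position of every requested band within the source band order.
    band_idx = [src_lba.index(band) for band in tgt_lba]
    return cube[..., band_idx]


src_lba = ['1', '2', '3', '4']
# Toy cube: spectral samples x training images x spectral bands.
cube = np.arange(2 * 3 * 4).reshape(2, 3, 4)
reordered = reorder_bands(cube, src_lba, ['4', '2', '1'])
```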

save(path_out, fmt='ENVI')[source]

Save the reference cube to disk.

Parameters
  • path_out (str) – output path on disk

  • fmt (str) – output format as GDAL format code

Return type

None

property wavelengths
class spechomo.training_data.TrainingData(im_X, im_Y, test_size)[source]

Bases: object

Class for analyzing statistical relations between a pair of machine learning training data cubes.

Get instance of TrainingData.

Parameters
  • im_X (Union[GeoArray, ndarray]) – input image X

  • im_Y (Union[GeoArray, ndarray]) – input image Y

  • test_size (Union[float, int]) – test size (proportion as float between 0 and 1) or absolute value as integer
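
The two accepted forms of test_size can be illustrated with a small sketch that splits sample indices into a training and a test part. It does not reproduce what TrainingData does internally: the helper names are hypothetical, and rounding a proportion up to the next integer is an assumption.

```python
import math

import numpy as np


def n_test_samples(n_samples, test_size):
    """Translate test_size (proportion or absolute count) into a sample count.

    Hypothetical helper for illustration only.
    """
    if isinstance(test_size, float):
        if not 0 < test_size < 1:
            raise ValueError("a float test_size must lie between 0 and 1")
        return math.ceil(n_samples * test_size)  # rounding up is an assumption
    if not 0 < test_size < n_samples:
        raise ValueError("an integer test_size must be smaller than n_samples")
    return int(test_size)


def split_indices(n_samples, test_size, seed=0):
    """Shuffle the sample indices and split them into (train, test)."""
    n_test = n_test_samples(n_samples, test_size)
    shuffled = np.random.default_rng(seed).permutation(n_samples)
    return shuffled[n_test:], shuffled[:n_test]


train_idx, test_idx = split_indices(200, 0.25)   # proportion
train_abs, test_abs = split_indices(200, 40)     # absolute count
```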

plot_scatter_matrix(figsize=(15, 15), mode='intersensor')[source]
plot_scattermatrix()[source]
show_band_scatterplot(band_src_im, band_tgt_im)[source]

spechomo.utils module

spechomo.utils.download_pretrained_classifiers(method, tgt_dir='/builds/geomultisens/spechomo/spechomo/resources/classifiers')[source]
spechomo.utils.explore_classifer_dillfile(path_dillFile)[source]

List all homogenization transformations included in the given .dill file.

Parameters

path_dillFile (str) –

Return type

DataFrame

spechomo.utils.export_classifiers_as_JSON(export_rootDir, classifier_rootDir='/builds/geomultisens/spechomo/spechomo/resources/classifiers', method=None, src_sat=None, src_sen=None, src_LBA=None, tgt_sat=None, tgt_sen=None, tgt_LBA=None, n_clusters=None)[source]

Export spectral harmonization classifiers as JSON files that match the provided filtering criteria.

NOTE: So far, this function will only work for LR classifiers.

Parameters
  • export_rootDir (str) – directory where to save the exported JSON files

  • classifier_rootDir (str) – directory containing classifiers for homogenization, either as .zip archives or as .dill files

  • method (Optional[str]) – filter by the machine learning approach to be used for spectral bands prediction

  • src_sat (Optional[str]) – filter by source satellite, e.g., ‘Landsat-8’

  • src_sen (Optional[str]) – filter by source sensor, e.g., ‘OLI_TIRS’

  • src_LBA (Optional[List[str]]) – filter by source bands list

  • tgt_sat (Optional[str]) – filter by target satellite, e.g., ‘Landsat-8’

  • tgt_sen (Optional[str]) – filter by target sensor, e.g., ‘OLI_TIRS’

  • tgt_LBA (Optional[List[str]]) – filter by target bands list

  • n_clusters (Optional[int]) – filter by the number of spectral clusters to be used during LR/ RR/ QR homogenization

Return type

None

spechomo.utils.im2spectra(geoArr)[source]

Convert 3D images to array of spectra samples (rows: samples; cols: spectral information).

Return type

ndarray

spechomo.utils.list_available_transformations(classifier_rootDir='/builds/geomultisens/spechomo/spechomo/resources/classifiers', method=None, src_sat=None, src_sen=None, src_LBA=None, tgt_sat=None, tgt_sen=None, tgt_LBA=None, n_clusters=None)[source]

List all sensor transformations available according to the given classifier root directory.

NOTE: This function can be used to copy/paste possible input parameters for spechomo.SpectralHomogenizer.predict_by_machine_learner().

Parameters
  • classifier_rootDir (str) – directory containing classifiers for homogenization, either as .zip archives or as .dill files

  • method (Optional[str]) – filter results by the machine learning approach to be used for spectral bands prediction

  • src_sat (Optional[str]) – filter results by source satellite, e.g., ‘Landsat-8’

  • src_sen (Optional[str]) – filter results by source sensor, e.g., ‘OLI_TIRS’

  • src_LBA (Optional[List[str]]) – filter results by source bands list

  • tgt_sat (Optional[str]) – filter results by target satellite, e.g., ‘Landsat-8’

  • tgt_sen (Optional[str]) – filter results by target sensor, e.g., ‘OLI_TIRS’

  • tgt_LBA (Optional[List[str]]) – filter results by target bands list

  • n_clusters (Optional[int]) – filter results by the number of spectral clusters to be used during LR/ RR/ QR homogenization

Return type

DataFrame

Returns

pandas.DataFrame listing all the available transformations

spechomo.utils.spectra2im(spectra, tgt_rows, tgt_cols)[source]

Convert array of spectra samples (rows: samples; cols: spectral information) to a 3D image.

Parameters
  • spectra (Union[GeoArray, ndarray]) – 2D array with rows: spectral samples / columns: spectral information (bands)

  • tgt_rows (int) – number of target image rows

  • tgt_cols (int) – number of target image columns

Return type

ndarray

Returns

3D array (rows x columns x spectral bands)
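
The conversion between a 3D image and a 2D array of spectra, as described for im2spectra and spectra2im, can be sketched with plain NumPy reshapes. The functions below are stand-ins written for this example and are not the spechomo implementations. Row-major pixel ordering is an assumption.

```python
import numpy as np


def im_to_spectra(im):
    """Flatten (rows, cols, bands) into (rows * cols, bands)."""
    rows, cols, bands = im.shape
    return im.reshape(rows * cols, bands)


def spectra_to_im(spectra, tgt_rows, tgt_cols):
    """Restore (tgt_rows, tgt_cols, bands) from (samples, bands)."""
    n_samples, bands = spectra.shape
    if n_samples != tgt_rows * tgt_cols:
        raise ValueError("tgt_rows * tgt_cols must equal the number of spectra")
    return spectra.reshape(tgt_rows, tgt_cols, bands)


im = np.arange(2 * 3 * 4).reshape(2, 3, 4)   # 2 rows, 3 columns, 4 bands
spectra = im_to_spectra(im)                  # one row per pixel
restored = spectra_to_im(spectra, tgt_rows=2, tgt_cols=3)
```

Under these assumptions, a round trip through both functions returns the original image, and row i of the spectra array holds the pixel at (i // cols, i % cols).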

spechomo.version module

Module contents