spechomo package

Submodules

spechomo.classifier module

class spechomo.classifier.ClassifierCollection(path_dillFile)[source]: Bases: object

class spechomo.classifier.Cluster_Learner(dict_clust_MLinstances, global_classifier)[source]

Bases: object

A class that holds the machine learning classifiers for multiple spectral clusters as well as a global classifier.

These classifiers can be applied to an input sensor image by using the predict method.

Get an instance of Cluster_Learner.

Parameters

dict_clust_MLinstances (Union[dict, ClassifierCollection]) – a dictionary of cluster specific machine learning classifiers
global_classifier (any) – the global machine learning classifier to be applied at image positions with high spectral dissimilarity to the available cluster centers

classmethod from_disk(classifier_rootDir, method, n_clusters, src_satellite, src_sensor, src_LBA, tgt_satellite, tgt_sensor, tgt_LBA, n_estimators=50)[source]

Read a previously saved Cluster_Learner from disk and return a Cluster_Learner instance.

Describe the classifier specifications with the given arguments.

Parameters

classifier_rootDir (str) – root directory of the classifiers
method (str) – harmonization method: ‘LR’: Linear Regression; ‘RR’: Ridge Regression; ‘QR’: Quadratic Regression; ‘RFR’: Random Forest Regression (50 trees; does not allow spectral sub-clustering)
n_clusters (int) – number of clusters
src_satellite (str) – source satellite, e.g., ‘Landsat-8’
src_sensor (str) – source sensor, e.g., ‘OLI_TIRS’
src_LBA (list) – source LayerBandsAssignment
tgt_satellite (str) – target satellite, e.g., ‘Landsat-8’
tgt_sensor (str) – target sensor, e.g., ‘OLI_TIRS’
tgt_LBA (list) – target LayerBandsAssignment
n_estimators (int) – number of estimators (only used in case of method == ‘RFR’)

Return type

Cluster_Learner

Returns

classifier instance loaded from disk

plot_sample_spectra(cluster_label='all', include_mean_spectrum=True, include_median_spectrum=True, ncols=5, **kw_fig)[source]

Return type: plt.figure

predict(im_src, cmap, nodataVal=None, cmap_nodataVal=None, cmap_unclassifiedVal=-1)[source]

Predict target satellite spectral information using separate prediction coefficients for spectral clusters.

Parameters

im_src (Union[ndarray, GeoArray]) – input image to be used for prediction
cmap (ndarray) – classification map that assigns each image spectrum to a corresponding cluster; must be a 2D np.ndarray with the same X/Y dimensions as im_src
nodataVal (Union[int, float, None]) – nodata value to be used to fill into the predicted image
cmap_nodataVal (Union[int, float, None]) – nodata class value of the nodata class of the classification map
cmap_unclassifiedVal (Union[int, float]) – class value of the ‘unclassified’ class of the classification map

Return type

ndarray

Returns
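The per-cluster prediction described above can be sketched in pure Python. This is only an illustration under stated assumptions, not the spechomo implementation: predict_by_cluster, the callables in cluster_predictors, and the nested-list image layout are hypothetical stand-ins for the real classifiers and arrays. The sketch also assumes that unclassified pixels are handled by the global classifier and that nodata pixels are filled with nodata_val.

```python
from typing import Callable, Dict, List, Sequence

# A predictor maps one source spectrum to one target spectrum (hypothetical type).
Predictor = Callable[[Sequence[float]], List[float]]


def predict_by_cluster(im_src, cmap, cluster_predictors: Dict[int, Predictor],
                       global_predictor: Predictor, n_tgt_bands: int,
                       nodata_val=-9999, cmap_nodata_val=-9999,
                       cmap_unclassified_val=-1):
    """Predict target spectra pixel by pixel.

    im_src: rows x cols x source bands (nested lists)
    cmap:   rows x cols cluster labels with the same X/Y shape as im_src
    """
    predicted = []
    for spectra_row, label_row in zip(im_src, cmap):
        out_row = []
        for spectrum, label in zip(spectra_row, label_row):
            if label == cmap_nodata_val:
                # nodata pixels are filled with the output nodata value
                out_row.append([nodata_val] * n_tgt_bands)
            elif label == cmap_unclassified_val:
                # assumption: unclassified pixels use the global predictor
                out_row.append(global_predictor(spectrum))
            else:
                out_row.append(cluster_predictors[label](spectrum))
        predicted.append(out_row)
    return predicted


# Toy predictors with one source band and one target band.
toy_cluster_predictors = {0: lambda s: [2 * s[0]], 1: lambda s: [3 * s[0]]}
toy_global_predictor = lambda s: [s[0]]
toy_image = [[[1.0], [2.0]], [[3.0], [4.0]]]
toy_cmap = [[0, 1], [-1, -9999]]
toy_result = predict_by_cluster(toy_image, toy_cmap, toy_cluster_predictors,
                                toy_global_predictor, n_tgt_bands=1)
```

In the toy data, each of the four pixels exercises a different branch: two cluster-specific predictors, the global predictor, and the nodata fill.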

predict_weighted_averages(im_src, cmap_3D, weights_3D=None, nodataVal=None, cmap_nodataVal=None, cmap_unclassifiedVal=-1)[source]

Predict target satellite spectral information using separate prediction coefficients for spectral clusters.

NOTE: This version of the prediction function uses the prediction coefficients of multiple spectral clusters and computes the result as a weighted average of them. Therefore, the classification map must assign multiple spectral clusters to each input pixel.

NOTE: At unclassified pixels (cmap_3D[y,x,z>0] == -1), the prediction result using global coefficients is ignored in the weighted average. In that case, the prediction result is based on the valid spectral clusters found and is not affected by the global coefficients (should improve prediction results).

Parameters

im_src (Union[ndarray, GeoArray]) – input image to be used for prediction
cmap_3D (ndarray) – classification map that assigns each image spectrum to multiple corresponding clusters; must be a 3D np.ndarray with the same X/Y dimensions as im_src
weights_3D (Optional[ndarray]) –
nodataVal (Union[int, float, None]) – nodata value to be used to fill into the predicted image
cmap_nodataVal (Union[int, float, None]) – nodata class value of the nodata class of the classification map
cmap_unclassifiedVal (Union[int, float]) – class value of the ‘unclassified’ class of the classification map

Return type

ndarray

Returns
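The weighted averaging for a single pixel might look like the following sketch. It is not the library's code: predict_weighted_average and the predictor callables are hypothetical, explicit weights are required here, and falling back to the global predictor when no valid cluster is left is an assumption of this example.

```python
def predict_weighted_average(spectrum, labels, weights, cluster_predictors,
                             global_predictor, unclassified_val=-1):
    """Average the predictions of several clusters for one pixel.

    labels and weights hold one entry per assigned cluster (the third
    dimension of a 3D classification map at this pixel).
    """
    # Drop unclassified entries so that they do not affect the average.
    valid = [(label, weight) for label, weight in zip(labels, weights)
             if label != unclassified_val]
    if not valid:
        # assumption: without any valid cluster, use the global predictor
        return global_predictor(spectrum)

    total_weight = sum(weight for _, weight in valid)
    predictions = [cluster_predictors[label](spectrum) for label, _ in valid]
    n_bands = len(predictions[0])
    return [
        sum(pred[band] * weight
            for pred, (_, weight) in zip(predictions, valid)) / total_weight
        for band in range(n_bands)
    ]


# Toy predictors with one source band and one target band.
toy_predictors = {0: lambda s: [2 * s[0]], 1: lambda s: [4 * s[0]]}
toy_global = lambda s: [s[0]]
toy_average = predict_weighted_average([1.0], [0, 1, -1], [1.0, 3.0, 5.0],
                                       toy_predictors, toy_global)
```

The third entry is unclassified, so its weight of 5.0 plays no role and the result depends only on clusters 0 and 1.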

print_stats()[source]

save_to_json(filepath)[source]

to_jsonable_dict()[source]: Create a dictionary containing a JSONable replicate of the current Cluster_Learner instance.

spechomo.classifier.classifier_from_json_str(json_str)[source]

Create a spectral harmonization classifier from a JSON string (JSON de-serialization).

Parameters: json_str – the JSON string to be used for de-serialization
Returns

spechomo.classifier.classifier_to_jsonable_dict(clf, skipkeys=None, include_typesdict=False)[source]

spechomo.classifier.get_jsonable_value(in_value, return_typesdict=False)[source]
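The idea behind a JSONable replicate with an accompanying types dictionary can be illustrated with the standard library. The helpers below (to_jsonable, attributes_to_json, attributes_from_json) are hypothetical and do not reproduce the behaviour of the spechomo functions; in particular, this sketch restores only the top-level type of each attribute.

```python
import json


def to_jsonable(value):
    """Return (json_safe_value, original_type_name) for a nested value."""
    type_name = type(value).__name__
    if value is None or isinstance(value, (str, int, float, bool)):
        return value, type_name
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v)[0] for v in value], type_name
    if isinstance(value, set):
        # sort by repr to obtain a deterministic order
        return [to_jsonable(v)[0] for v in sorted(value, key=repr)], type_name
    if isinstance(value, dict):
        return {str(k): to_jsonable(v)[0] for k, v in value.items()}, type_name
    raise TypeError(f"cannot convert {type_name} to a JSON-safe value")


def attributes_to_json(attributes):
    """Serialize a flat attribute dict together with a types dictionary."""
    values, types = {}, {}
    for key, val in attributes.items():
        values[key], types[key] = to_jsonable(val)
    return json.dumps({"values": values, "types": types}, sort_keys=True)


def attributes_from_json(json_str):
    """Restore the attribute dict, casting top-level tuples and sets back."""
    payload = json.loads(json_str)
    casts = {"tuple": tuple, "set": set}
    return {key: casts.get(payload["types"][key], lambda v: v)(val)
            for key, val in payload["values"].items()}


toy_attributes = {"coef": (1.5, 2.0), "bands": {3, 1}, "method": "LR"}
toy_restored = attributes_from_json(attributes_to_json(toy_attributes))
```

Storing the type names next to the values is what makes the round trip lossless for containers that JSON itself cannot represent.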

spechomo.classifier_creation module

class spechomo.classifier_creation.ClusterClassifier_Generator(list_refcubes, logger=None)[source]

Bases: object

Class for creating collections of machine learning classifiers that can be used for spectral homogenization.

Get an instance of ClusterClassifier_Generator.

Parameters

list_refcubes (List[Union[str, RefCube]]) – list of RefCube instances or paths for which the classifiers are to be created.
logger (Optional[Logger]) – instance of logging.Logger()

create_classifiers(outDir, method='LR', n_clusters=50, sam_classassignment=False, CPUs=None, max_distance='80%', max_angle=5, **kwargs)[source]

Create cluster classifiers for all combinations of the reference cubes given in __init__().

Parameters

outDir (str) – output directory for the created cluster classifier collections
method (str) – type of machine learning classifiers to be included in classifier collections: ‘LR’: Linear Regression; ‘RR’: Ridge Regression; ‘QR’: Quadratic Regression; ‘RFR’: Random Forest Regression (50 trees with maximum depth of 3 by default)
n_clusters (int) – number of clusters to be used for KMeans clustering
sam_classassignment (bool) – False: use the minimal Euclidean distance to assign classes to cluster centers; True: use the minimal spectral angle to assign classes to cluster centers
CPUs (Optional[int]) – number of CPUs to be used for KMeans clustering
max_distance (Union[int, str]) – maximum spectral distance allowed during filtering of training spectra - if given as string, e.g., ‘80%’ excludes the worst 20 % of the input spectra
max_angle (Union[int, str]) – maximum spectral angle allowed during filtering of training spectra - if given as string, e.g., ‘80%’ excludes the worst 20 % of the input spectra
kwargs (dict) – keyword arguments to be passed to machine learner

Return type

None
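How a percentage string such as ‘80%’ could be turned into an absolute threshold is sketched below. The helpers resolve_threshold and filter_spectra are hypothetical, and the linearly interpolated percentile is an assumption of this example; the library may compute percentiles differently.

```python
import math


def resolve_threshold(max_value, values):
    """Turn a number or a percentage string into an absolute threshold."""
    if isinstance(max_value, str):
        percent = float(max_value.strip().rstrip('%'))
        ordered = sorted(values)  # must not be empty for percentage strings
        # linearly interpolated percentile (assumed definition)
        position = (len(ordered) - 1) * percent / 100
        lower, upper = math.floor(position), math.ceil(position)
        fraction = position - lower
        return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction
    return float(max_value)


def filter_spectra(spectra, distances, max_distance='80%'):
    """Keep only the spectra whose distance does not exceed the threshold."""
    threshold = resolve_threshold(max_distance, distances)
    return [s for s, d in zip(spectra, distances) if d <= threshold]


toy_distances = list(range(11))           # 0, 1, ..., 10
toy_spectra = [[d, d] for d in toy_distances]
toy_kept = filter_spectra(toy_spectra, toy_distances, max_distance='80%')
```

With ‘80%’, the spectra with the largest distances (roughly the worst 20 %) are dropped, whereas a plain number is used as the threshold directly.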

static train_machine_learner(train_X, train_Y, test_X, test_Y, method, **kwargs)[source]

Use the given train and test data to train a machine learner and append some accuracy statistics.

Parameters

train_X (np.ndarray) – reference training data
train_Y (np.ndarray) – target training data
test_X (np.ndarray) – reference test data
test_Y (np.ndarray) – target test data
method (str) – type of machine learning classifiers to be included in classifier collections: ‘LR’: Linear Regression; ‘RR’: Ridge Regression; ‘QR’: Quadratic Regression; ‘RFR’: Random Forest Regression (50 trees)
kwargs (dict) – keyword arguments to be passed to the __init__() function of machine learners

Return type

Union[LinearRegression, Ridge, Pipeline, RandomForestRegressor]
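The train/test pattern (fit on the training data, then attach accuracy statistics computed on the test data) can be shown with a deliberately reduced example. It is a single-band stand-in written with the standard library, not the scikit-learn based training used by the package; fit_line and rmse are hypothetical helpers.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(xs, ys, slope, intercept):
    """Root mean square error of the fitted line on the given data."""
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# One source band (X) and one target band (Y), split into train and test.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
test_x, test_y = [4.0, 5.0], [10.0, 12.0]

toy_slope, toy_intercept = fit_line(train_x, train_y)
toy_stats = {"rmse_test": rmse(test_x, test_y, toy_slope, toy_intercept)}
```

Keeping the test data out of the fit is what makes the reported error a meaningful estimate for unseen spectra.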

class spechomo.classifier_creation.ReferenceCube_Generator(filelist_refs, tgt_sat_sen_list=None, dir_refcubes='', n_clusters=10, tgt_n_samples=1000, v=False, logger=None, CPUs=None, dir_clf_dump='')[source]

Bases: object

Class for creating reference cubes that are later used as training data for SpecHomo_Classifier.

Initialize ReferenceCube_Generator.

Parameters

filelist_refs (List[str]) – list of (hyperspectral) reference images, representing BOA reflectance, scaled between 0 and 10000
tgt_sat_sen_list (Optional[List[Tuple[str, str]]]) – list of satellite/sensor tuples containing those sensors for which the reference cube is to be computed, e.g., [(‘Landsat-8’, ‘OLI_TIRS’), (‘Landsat-5’, ‘TM’)]
dir_refcubes (str) – output directory for the generated reference cube
n_clusters (int) – number of clusters to be used for clustering the input images (KMeans)
tgt_n_samples (int) – number of spectra to be collected from each input image
v (bool) – verbose mode
logger (Optional[Logger]) – instance of logging.Logger()
CPUs (Optional[int]) – number of CPUs to use for computation
dir_clf_dump (str) – directory where to store the serialized KMeans classifier

cluster_image_and_get_uniform_spectra(im, downsamp_sat=None, downsamp_sen=None, basename_clf_dump='', try_read_dumped_clf=True, sam_classassignment=False, max_distance='80%', max_angle=6, nmin_unique_spectra=50, progress=False)[source]

Compute KMeans clusters for the given image and return an array of uniform random samples.

Parameters

im (Union[str, GeoArray, ndarray]) – image to be clustered
downsamp_sat (Optional[str]) – satellite code used for intermediate image dimensionality reduction (the input image is spectrally resampled to this satellite before it is clustered). Requires downsamp_sen. If it is None, no intermediate downsampling is performed.
downsamp_sen (Optional[str]) – sensor code used for intermediate image dimensionality reduction (requires downsamp_sat)
basename_clf_dump (str) – basename of serialized KMeans classifier
try_read_dumped_clf (bool) – try to read a previously serialized KMeans classifier from disk (massively speeds up the RefCube generation)
sam_classassignment (bool) – False: use the minimal Euclidean distance to assign classes to cluster centers; True: use the minimal spectral angle to assign classes to cluster centers
max_distance (int) – spectra with a larger spectral distance than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral distance is computed as the 20% percentile within each cluster
max_angle (Union[int, float, str]) – spectra with a larger spectral angle than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral angle is computed as the 20% percentile within each cluster
nmin_unique_spectra (Union[int, float, str]) – if a cluster has fewer unique spectra than the given number, it is not included in the reference cube (default: 50)
progress (bool) – whether to show progress bars or not

Return type

ndarray

Returns

2D array (rows: tgt_n_samples, columns: spectral information / bands)

generate_reference_cubes(fmt_out='ENVI', try_read_dumped_clf=True, sam_classassignment=False, max_distance='80%', max_angle=6, nmin_unique_spectra=50, alg_nodata='radical', progress=True)[source]

Generate reference spectra from all hyperspectral input images.

Workflow:

1. Clustering/classification of hyperspectral images and selection of a given number of random signatures:
   a. (Spectral downsampling to lower spectral resolution (speedup))
   b. KMeans clustering
   c. Selection of the same number of signatures from each cluster to avoid unequal amounts of training data
2. Spectral resampling of the selected hyperspectral signatures (for each input image)
3. Add resampled spectra to reference cubes for each target sensor and write cubes to disk

Parameters

fmt_out (str) – output format (GDAL driver code)
try_read_dumped_clf (bool) – try to read a previously serialized KMeans classifier from disk (massively speeds up the RefCube generation)
sam_classassignment (bool) – False: use the minimal Euclidean distance to assign classes to cluster centers; True: use the minimal spectral angle to assign classes to cluster centers
max_distance (int) – spectra with a larger spectral distance than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral distance is computed as the 20% percentile within each cluster
max_angle (Union[int, float, str]) – spectra with a larger spectral angle than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral angle is computed as the 20% percentile within each cluster
nmin_unique_spectra (Union[int, float, str]) – if a cluster has fewer unique spectra than the given number, it is not included in the reference cube (default: 50)
alg_nodata (str) – algorithm for handling pixels where the spectral bands of the source image contain nodata within the spectral response of a target band: ‘radical’: set output band to nodata; ‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)
progress (bool) – show progress bar (default: True)

Return type

ReferenceCube_Generator.refcubes

Returns

np.array: [tgt_n_samples x images x spectral bands of the target sensor]

property refcubes

Return a dict holding instances of RefCube for each target satellite / sensor of self.tgt_sat_sen_list.

Return type: Dict[Tuple[str, str], RefCube]

resample_image_spectrally(src_im, tgt_rsr, src_nodata=None, alg_nodata='radical', progress=False)[source]

Perform spectral resampling of the given image to match the given spectral response functions.

Parameters

src_im (Union[str, GeoArray]) – source image to be resampled
tgt_rsr (RelativeSpectralResponse) – target relative spectral response functions to be used for spectral resampling
src_nodata (Union[int, float, None]) – source image nodata value
alg_nodata (str) – algorithm for handling pixels where the spectral bands of the source image contain nodata within the spectral response of a target band: ‘radical’: set output band to nodata; ‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)
progress (bool) – show progress bar (default: False)

Return type

Optional[GeoArray]

Returns

resample_spectra(spectra, src_cwl, tgt_rsr, nodataVal, alg_nodata='radical')[source]

Perform spectral resampling of the given spectra to match the given spectral response functions.

Parameters

spectra (Union[GeoArray, ndarray]) – 2D array (rows: spectral samples; columns: spectral information / bands)
src_cwl (Union[list, array]) – central wavelength positions of input spectra
tgt_rsr (RelativeSpectralResponse) – target relative spectral response functions to be used for spectral resampling
nodataVal (int) – nodata value of the given spectra to be ignored during resampling
alg_nodata (str) – algorithm for handling pixels where the spectral bands of the source image contain nodata within the spectral response of a target band: ‘radical’: set output band to nodata; ‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)

Return type

ndarray

Returns

spechomo.classifier_creation.get_filename_classifier_collection(method, src_satellite, src_sensor, n_clusters=1, **cls_kwinit)[source]

spechomo.classifier_creation.get_machine_learner(method='LR', **init_params)[source]

Get an instance of a machine learner.

Parameters

method (str) – ‘LR’: Linear Regression; ‘RR’: Ridge Regression; ‘QR’: Quadratic Regression; ‘RFR’: Random Forest Regression (50 trees)
init_params (dict) – parameters to be passed to __init__() function of the returned machine learner model.

Return type

Union[LinearRegression, Ridge, Pipeline]

spechomo.clustering module

class spechomo.clustering.KMeansRSImage(im, n_clusters, sam_classassignment=False, CPUs=1, v=False)[source]

Bases: object

Class for clustering a given input image by using K-Means algorithm.

NOTE: Based on the nodata value of the input GeoArray, those pixels that have nodata values in some bands are ignored when computing the cluster coefficients. Nodata values would otherwise affect the clustering result.

apply_clusters(image)[source]

property clustermap

property clusters

Return type: KMeans

compute_clusters(nmax_spectra=100000)[source]

Compute the cluster means and labels.

Parameters: nmax_spectra – maximum number of spectra to be included (pseudo-randomly selected, reproducible)
Returns
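A reproducible pseudo-random selection of at most nmax_spectra spectra can be obtained with a seeded random number generator. The function subsample_spectra and its seed parameter are hypothetical and only illustrate the principle; they do not show how the package selects spectra.

```python
import random


def subsample_spectra(spectra, nmax_spectra=100000, seed=0):
    """Return at most nmax_spectra spectra, selected reproducibly."""
    if len(spectra) <= nmax_spectra:
        return list(spectra)
    rng = random.Random(seed)  # fixed seed: same selection on every call
    indices = sorted(rng.sample(range(len(spectra)), nmax_spectra))
    return [spectra[i] for i in indices]


toy_all_spectra = [[i, i + 1] for i in range(100)]
toy_subset_a = subsample_spectra(toy_all_spectra, nmax_spectra=10)
toy_subset_b = subsample_spectra(toy_all_spectra, nmax_spectra=10)
```

Because the generator is seeded, both calls return the same ten spectra, which keeps the clustering result reproducible between runs.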

static compute_euclidian_distance_2D(spectra, endmembers)[source]

static compute_euclidian_distance_for_labelled_spectra(spectra, labels, endmembers)[source]

compute_spectral_angles()[source]

compute_spectral_distances()[source]

dump(path_out)[source]

classmethod from_disk(path_clf, im)[source]

Get an instance of KMeansRSImage from a previously saved classifier.

Parameters

path_clf (str) – path of serialized classifier (dill file)
im (GeoArray) – path of the image cube belonging to that classifier

Return type

KMeansRSImage

Returns

KMeansRSImage

get_purest_spectra_from_each_cluster(samplesize=50)[source]

Return a given number of spectra directly surrounding the center of each cluster.

E.g., 50 spectra belonging to cluster 1, 50 spectra belonging to cluster 2 and so on.

Parameters: samplesize (int) – number of spectra to be selected from each cluster
Return type: dict
Returns

get_random_spectra_from_each_cluster(samplesize=50, max_distance=None, max_angle=None, nmin_unique_spectra=50)[source]

Return a given number of spectra randomly selected within each cluster.

E.g., 50 spectra belonging to cluster 1, 50 spectra belonging to cluster 2 and so on.

Parameters

samplesize (int) – number of spectra to be randomly selected from each cluster
max_distance (Union[int, float, str, None]) – spectra with a larger spectral distance than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral distance is computed as the 20% percentile within each cluster
max_angle (Union[int, float, str, None]) – spectra with a larger spectral angle than the given value will be excluded from random sampling; if given as a string like ‘20%’, the maximum spectral angle is computed as the 20% percentile within each cluster
nmin_unique_spectra (int) – if a cluster has fewer unique spectra than the given number, its spectra are not used (missing values are returned)

Return type

dict

Returns
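Drawing the same number of random spectra from each cluster, while skipping clusters with too few unique spectra, might be sketched as follows. The function random_spectra_per_cluster is hypothetical; representing skipped clusters by None and sampling without replacement are assumptions of this example.

```python
import random
from collections import defaultdict


def random_spectra_per_cluster(spectra, labels, samplesize=50,
                               nmin_unique_spectra=50, seed=0):
    """Return a dict mapping each cluster label to randomly chosen spectra."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for spectrum, label in zip(spectra, labels):
        by_cluster[label].append(tuple(spectrum))

    selection = {}
    for label, members in sorted(by_cluster.items()):
        if len(set(members)) < nmin_unique_spectra:
            # assumption: clusters with too few unique spectra yield None
            selection[label] = None
        else:
            selection[label] = rng.sample(members, min(samplesize, len(members)))
    return selection


toy_spectra = [[i, 2 * i] for i in range(5)] + [[9, 9], [9, 9]]
toy_labels = [0, 0, 0, 0, 0, 1, 1]
toy_selection = random_spectra_per_cluster(toy_spectra, toy_labels,
                                           samplesize=3, nmin_unique_spectra=3)
```

Cluster 1 contains only one unique spectrum and is therefore skipped, while cluster 0 contributes three spectra.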

property goodSpecMask

property labels: Get labels for all clustered spectra (excluding spectra that contain nodata values).

property labels_with_nodata: Get the labels for all pixels (including those containing nodata values).

property n_spectra: Get number of spectra used for clustering (excluding spectra containing nodata values).
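The relation between a mask of usable spectra, the labels of the clustered spectra, and the labels for all pixels can be illustrated like this. Both helpers and the nodata label of -9999 are hypothetical choices for the sketch and are not taken from the package.

```python
def good_spectra_mask(pixels, nodata_val):
    """True for each pixel whose spectrum contains no nodata value."""
    return [all(value != nodata_val for value in spectrum) for spectrum in pixels]


def expand_labels(labels, mask, nodata_label=-9999):
    """Spread the labels of the good spectra over all pixels.

    Pixels excluded from clustering receive nodata_label (assumed value).
    """
    remaining = iter(labels)
    return [next(remaining) if good else nodata_label for good in mask]


toy_pixels = [[100, 200], [-9999, 300], [400, 500]]
toy_mask = good_spectra_mask(toy_pixels, nodata_val=-9999)
toy_labels_all = expand_labels([0, 1], toy_mask)
```

Only two of the three pixels take part in clustering, so two labels are expanded to three entries, with the excluded pixel marked separately.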

plot_cluster_centers(figsize=(15, 5))[source]

Show a plot of the cluster center signatures.

Parameters: figsize (tuple) – figure size (inches)
Return type: None

plot_cluster_histogram(figsize=(15, 5))[source]

Show a histogram indicating the proportion of each cluster label in percent.

Parameters: figsize (tuple) – figure size (inches)
Return type: None

plot_clustermap(figsize=None)[source]

Show the clustered image.

Parameters: figsize (Optional[tuple]) – figure size (inches)
Return type: None

save_clustermap(path_out, **kw_save)[source]

property spectra: Get spectra used for clustering (excluding spectra containing nodata values that would affect clustering).

property spectral_angles: Get spectral angles in degrees for all pixels that don’t contain nodata values.

property spectral_angles_with_nodata

property spectral_distances: Get spectral distances for all pixels that don’t contain nodata values.

property spectral_distances_with_nodata
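Both similarity measures used in this class, the spectral angle (in degrees) and the Euclidean distance between a spectrum and a cluster center, follow from their standard definitions. The functions below are a minimal sketch with hypothetical names, not the vectorized code of the package.

```python
import math


def spectral_angle_deg(spectrum, center):
    """Angle between two spectra in degrees (0 means identical direction)."""
    dot = sum(a * b for a, b in zip(spectrum, center))
    norm = math.sqrt(sum(a * a for a in spectrum) * sum(b * b for b in center))
    if norm == 0:
        raise ValueError("spectral angle is undefined for all-zero spectra")
    # clamp to [-1, 1] to guard against floating point round-off
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def euclidean_distance(spectrum, center):
    """Euclidean distance between two spectra."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spectrum, center)))


toy_angle = spectral_angle_deg([1.0, 0.0], [0.0, 1.0])
toy_distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
```

The spectral angle ignores the overall brightness of a spectrum (scaling a spectrum does not change the angle), whereas the Euclidean distance is sensitive to it.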

spechomo.exceptions module

exception spechomo.exceptions.ClassifierNotAvailableError(spechomo_method, src_sat, src_sen, src_LBA, tgt_sat, tgt_sen, tgt_LBA, n_clusters)[source]: Bases: RuntimeError

spechomo.logging module

SpecHomo logging module containing logging related classes and functions.

class spechomo.logging.LessThanFilter(exclusive_maximum, name='')[source]

Bases: logging.Filter

Filter class to filter log messages by a maximum log level.

Based on http://stackoverflow.com/questions/2302315/how-can-info-and-debug-logging-message-be-sent-to-stdout-and-higher-level-messag

Get an instance of LessThanFilter.

Parameters

exclusive_maximum – maximum log level, e.g., logging.WARNING
name –

filter(record)[source]

Filter function.

NOTE: Returns True if logging level of the given record is below the maximum log level.

Parameters: record –
Returns: bool
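A filter with an exclusive maximum level is typically used to send low-level messages to stdout and everything else to stderr. The following sketch rebuilds that pattern with the standard logging module; MaxLevelFilter and build_split_logger are hypothetical names, and the class is a stand-in rather than the package's LessThanFilter.

```python
import logging
import sys


class MaxLevelFilter(logging.Filter):
    """Pass only records whose level is below an exclusive maximum."""

    def __init__(self, exclusive_maximum, name=''):
        super().__init__(name)
        self.max_level = exclusive_maximum

    def filter(self, record):
        return record.levelno < self.max_level


def build_split_logger(name='demo'):
    """DEBUG/INFO go to stdout, WARNING and above go to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    for handler in list(logger.handlers):
        logger.removeHandler(handler)

    out_handler = logging.StreamHandler(sys.stdout)
    out_handler.setLevel(logging.DEBUG)
    out_handler.addFilter(MaxLevelFilter(logging.WARNING))

    err_handler = logging.StreamHandler(sys.stderr)
    err_handler.setLevel(logging.WARNING)

    logger.addHandler(out_handler)
    logger.addHandler(err_handler)
    return logger


demo_logger = build_split_logger()
demo_logger.info("written to stdout")
demo_logger.warning("written to stderr")
```

Without the filter, the stdout handler would also emit warnings and errors, so every high-level message would appear twice.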

class spechomo.logging.SpecHomo_Logger(name_logfile, fmt_suffix=None, path_logfile=None, log_level='INFO', append=True)[source]

Bases: logging.Logger

Class for the SpecHomo logger.

Return a logging.Logger instance pointing to the given logfile path.

Parameters

name_logfile (str) –
fmt_suffix (Optional[any]) – if given, it will be included into log formatter
path_logfile (Optional[str]) – if no path is given, only a StreamHandler is created
log_level (any) – the logging level to be used (choices: ‘DEBUG’, ‘INFO’, ‘WARNING’, ‘ERROR’, ‘CRITICAL’; default: ‘INFO’)
append (bool) – whether to append the log message to an existing logfile (True) or to create a new logfile (False); default: True

property captured_stream: str

Return the already captured logging stream.

NOTE: To set self.captured_stream, assign a string: self.captured_stream = ‘any string’

Return type: str

close()[source]: Close all logging handlers.

view_logfile()[source]: View the log file written to disk.

spechomo.logging.close_logger(logger)[source]

Close the handlers of the given logging.Logger instance.

Parameters: logger – logging.Logger instance or subclass instance

spechomo.logging.shutdown_loggers()[source]: Shutdown any currently active loggers.

spechomo.options module

spechomo.prediction module

Main module.

class spechomo.prediction.RSImage_ClusterPredictor(method='LR', n_clusters=50, classif_alg='MinDist', classifier_rootDir='', CPUs=1, logger=None, progress=True, **kw_clf_init)[source]

Bases: object

Predictor class applying the predict() function of a machine learning classifier described by the given args.

Get an instance of RSImage_ClusterPredictor.

Parameters

method (str) – machine learning approach to be used for spectral bands prediction: ‘LR’: Linear Regression; ‘RR’: Ridge Regression; ‘QR’: Quadratic Regression; ‘RFR’: Random Forest Regression (50 trees; does not allow spectral sub-clustering)
n_clusters (int) – Number of spectral clusters to be used during LR/ RR/ QR homogenization. E.g., 50 means that the image to be converted to the spectral target sensor is clustered into 50 spectral clusters and one separate machine learner per cluster is applied to the input data to predict the homogenized image. If ‘n_clusters’ is set to 1, the source image is not clustered and only one machine learning classifier is used for prediction.
classif_alg (str) – algorithm to be used for image classification (to define which cluster each pixel belongs to): ‘MinDist’: Minimum Distance (Nearest Centroid); ‘kNN’: k-nearest-neighbour; ‘kNN_MinDist’: k-nearest-neighbour Minimum Distance (Nearest Centroid); ‘SAM’: spectral angle mapping; ‘kNN_SAM’: k-nearest-neighbour spectral angle mapping; ‘SID’: spectral information divergence; ‘FEDSA’: fused Euclidean distance / spectral angle; ‘kNN_FEDSA’: k-nearest-neighbour fused Euclidean distance / spectral angle
classifier_rootDir (str) – root directory where machine learning classifiers are stored.
CPUs (Optional[int]) – number of CPUs to use (default: 1)
progress (bool) – whether to show progress bars
logger (Optional[Logger]) – instance of logging.Logger()
kw_clf_init – keyword arguments to be passed to classifier init functions if possible, e.g., ‘n_neighbours’ sets the number of neighbours to be considered in kNN classification algorithms (set by ‘classif_alg’)

compute_prediction_errors(im_predicted, cluster_classifier, nodataVal=None, cmap_nodataVal=None)[source]

Compute errors that quantify prediction inaccuracy per band and per pixel.

Parameters

im_predicted (Union[ndarray, GeoArray]) – 3D array representing the predicted image
cluster_classifier (Cluster_Learner) – instance of Cluster_Learner
nodataVal (Optional[float]) – no data value of the input image (auto-computed if not given or contained in im_predicted GeoArray) NOTE: The value is also used as output nodata value for the errors array.
cmap_nodataVal (Optional[float]) – no data value for the classification map in case more than one sub-class is used for prediction

Return type

ndarray

Returns

3D array (int16) representing prediction errors per band and pixel

get_classifier(src_satellite, src_sensor, src_LBA, tgt_satellite, tgt_sensor, tgt_LBA)[source]

Select the correct machine learning classifier out of previously saved classifier collections.

Describe the classifier specifications with the given arguments.

Parameters

src_satellite (str) – source satellite, e.g., ‘Landsat-8’
src_sensor (str) – source sensor, e.g., ‘OLI_TIRS’
src_LBA (list) – source LayerBandsAssignment
tgt_satellite (str) – target satellite, e.g., ‘Landsat-8’
tgt_sensor (str) – target sensor, e.g., ‘OLI_TIRS’
tgt_LBA (list) – target LayerBandsAssignment

Return type

Cluster_Learner

Returns

classifier instance loaded from disk

predict(image, classifier, in_nodataVal=None, out_nodataVal=None, cmap_nodataVal=-9999, global_clf_threshold=None, unclassified_pixVal=-1)[source]

Apply the prediction function of the given classifier to the given remote sensing image.

Parameters

image (Union[ndarray, GeoArray]) – 3D array representing the input image
classifier (Cluster_Learner) – the classifier instance
in_nodataVal (Optional[float]) – no data value of the input image (auto-computed if not given or contained in image GeoArray)
out_nodataVal (Optional[float]) – no data value written into the predicted image (copied from the input image if not given)
cmap_nodataVal (float) – no data value for the classification map in case more than one sub-class is used for prediction (default: -9999)
global_clf_threshold (Union[int, float, str, None]) – If given, all pixels where the computed similarity metric (set by ‘classif_alg’) exceeds the given threshold are predicted using the global classifier (based on a single transformation per band). Not usable for ‘kNN’. May be given as float, integer, or string to label a certain distance percentile; if given as a string, it must match the format ‘10%’, which labels the worst 10 % of the distances as unclassified.
unclassified_pixVal (int) – pixel value to be used in the classification map for unclassified pixels (default: -1)

Return type

GeoArray

Returns

3D array representing the predicted spectral image cube
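How a minimum-distance classification with a global-classifier threshold could produce the classification map is sketched below. The function classify_min_distance is hypothetical and handles only a numeric threshold; it is not the classifier used by the package.

```python
import math


def classify_min_distance(spectra, centers, threshold=None, unclassified_val=-1):
    """Assign each spectrum to its nearest cluster center.

    Spectra whose distance to the nearest center exceeds the threshold are
    labelled as unclassified, so that a global classifier can handle them.
    """
    labels = []
    for spectrum in spectra:
        distances = [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(spectrum, center)))
            for center in centers
        ]
        best = min(range(len(centers)), key=distances.__getitem__)
        if threshold is not None and distances[best] > threshold:
            labels.append(unclassified_val)
        else:
            labels.append(best)
    return labels


toy_centers = [[0.0, 0.0], [10.0, 10.0]]
toy_pixels = [[1.0, 1.0], [9.0, 9.0], [100.0, 100.0]]
toy_cmap = classify_min_distance(toy_pixels, toy_centers, threshold=20)
```

The third spectrum is far from both centers and is labelled -1, which corresponds to the pixels that are later predicted with the global classifier.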

class spechomo.prediction.SpectralHomogenizer(classifier_rootDir='', logger=None, CPUs=None, progress=True)[source]

Bases: object

Class for applying spectral homogenization by applying an interpolation or machine learning approach.

Get instance of SpectralHomogenizer.

Parameters

classifier_rootDir – root directory where machine learning classifiers are stored.
logger – instance of logging.Logger
progress – whether to show progress bars

interpolate_cube(arrcube, source_CWLs, target_CWLs, kind='linear')[source]

Spectrally interpolate the spectral bands of a remote sensing image to new band positions.

Parameters

arrcube (Union[ndarray, GeoArray]) – array to be spectrally interpolated
source_CWLs (list) – list of source central wavelength positions
target_CWLs (list) – list of target central wavelength positions
kind (str) – interpolation kind to be passed to scipy.interpolate.interp1d (default: ‘linear’)

Return type

GeoArray

Returns
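For a single spectrum, linear interpolation to new band positions works as in the sketch below. It is a pure-Python stand-in for the ‘linear’ kind only; interpolate_spectrum is a hypothetical helper that assumes strictly increasing source wavelengths and rejects target wavelengths outside the source range.

```python
from bisect import bisect_right


def interpolate_spectrum(values, source_cwls, target_cwls):
    """Linearly interpolate one spectrum to new central wavelengths.

    source_cwls must be strictly increasing and contain at least two entries.
    """
    interpolated = []
    for wavelength in target_cwls:
        if not source_cwls[0] <= wavelength <= source_cwls[-1]:
            raise ValueError(f"{wavelength} is outside the source range")
        # index of the right neighbour, clipped to the last band
        right = min(bisect_right(source_cwls, wavelength), len(source_cwls) - 1)
        x0, x1 = source_cwls[right - 1], source_cwls[right]
        y0, y1 = values[right - 1], values[right]
        interpolated.append(y0 + (y1 - y0) * (wavelength - x0) / (x1 - x0))
    return interpolated


toy_source_cwls = [400.0, 500.0, 600.0]
toy_values = [100.0, 200.0, 400.0]
toy_interpolated = interpolate_spectrum(toy_values, toy_source_cwls,
                                        [400.0, 450.0, 500.0, 600.0])
```

Interpolation uses only the central wavelengths and ignores the shape of the spectral response functions, which is why it is a simple alternative to the machine learning approach.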

predict_by_machine_learner(arrcube, method, src_satellite, src_sensor, src_LBA, tgt_satellite, tgt_sensor, tgt_LBA, n_clusters=50, classif_alg='MinDist', kNN_n_neighbors=10, global_clf_threshold='10%', src_nodataVal=None, out_nodataVal=None, compute_errors=False, bandwise_errors=True, fallback_argskwargs=None)[source]

Predict spectral bands of target sensor by applying a machine learning approach.

NOTE: You may use the function spechomo.utils.list_available_transformations() to get a list of available transformations. You may also copy the input parameters for this method from the output there.

Parameters

arrcube (Union[ndarray, GeoArray]) – input image array for target sensor spectral band prediction (rows x cols x bands)
method (str) – machine learning approach to be used for spectral bands prediction: ‘LR’: Linear Regression; ‘RR’: Ridge Regression; ‘QR’: Quadratic Regression; ‘RFR’: Random Forest Regression (50 trees; does not allow spectral sub-clustering)
src_satellite (str) – source satellite, e.g., ‘Landsat-8’
src_sensor (str) – source sensor, e.g., ‘OLI_TIRS’
src_LBA (list) – source LayerBandsAssignment # TODO document this
tgt_satellite (str) – target satellite, e.g., ‘Landsat-8’
tgt_sensor (str) – target sensor, e.g., ‘OLI_TIRS’
tgt_LBA (list) – target LayerBandsAssignment # TODO document this
n_clusters (int) – Number of spectral clusters to be used during LR/ RR/ QR homogenization. E.g., 50 means that the image to be converted to the spectral target sensor is clustered into 50 spectral clusters and one separate machine learner per cluster is applied to the input data to predict the homogenized image. If ‘n_clusters’ is set to 1, the source image is not clustered and only one machine learning classifier is used for prediction.
classif_alg (str) – Multispectral classification algorithm to be used to determine the spectral cluster each pixel belongs to: ‘MinDist’: Minimum Distance (Nearest Centroid); ‘kNN’: k-nearest-neighbour; ‘kNN_MinDist’: k-nearest-neighbour Minimum Distance (Nearest Centroid); ‘SAM’: spectral angle mapping; ‘kNN_SAM’: k-nearest-neighbour spectral angle mapping; ‘SID’: spectral information divergence; ‘FEDSA’: fused Euclidean distance / spectral angle; ‘kNN_FEDSA’: k-nearest-neighbour fused Euclidean distance / spectral angle
kNN_n_neighbors (int) – The number of neighbors to be considered in case ‘classif_alg’ is set to ‘kNN’. Otherwise, this parameter is ignored.
global_clf_threshold (Union[str, int, float]) – If given, all pixels where the computed similarity metric (set by ‘classif_alg’) exceeds the given threshold are predicted using the global classifier (based on a single transformation per band). Only usable for ‘MinDist’, ‘SAM’ and ‘SID’ as well as their kNN variants. May be given as float, integer, or string to label a certain distance percentile; if given as a string, it must match the format ‘10%’, which labels the worst 10 % of the distances as unclassified.
src_nodataVal (Optional[int]) – no data value of source image (arrcube); if no nodata value is set, an attempt is made to compute it automatically from arrcube
out_nodataVal (Optional[int]) – no data value of predicted image
compute_errors (bool) – whether to compute pixel- / bandwise model errors for estimated pixel values (default: False)
bandwise_errors – whether to compute error information for each band separately (True, default) or to average errors over bands using median (False) (ignored in case of fallback)
fallback_argskwargs (Optional[dict]) – arguments and keyword arguments to be passed to the fallback algorithm SpectralHomogenizer.interpolate_cube() in case harmonization fails
Returns: predicted array (rows x columns x bands)
Return type: Tuple[np.ndarray, Union[np.ndarray, None]]

spechomo.resampling module

class spechomo.resampling.SpectralResampler(wvl_src, rsr_tgt, logger=None)[source]

Bases: object

Class for spectral resampling of a single spectral signature (1D-array) or an image (3D-array).

Get an instance of the SpectralResampler class.

Parameters

wvl_src (np.ndarray) – center wavelength positions of the source spectrum
rsr_tgt (RSR) – relative spectral response of the target instrument as an instance of pyrsr.RelativeSpectralResponse.

resample_image(image_cube, tiledims=(20, 20), nodataVal=None, alg_nodata='radical', CPUs=None)[source]

Resample the given spectral image cube according to the spectral response functions of the target instrument.

Parameters

image_cube (Union[GeoArray, ndarray]) – image (3D array) containing the spectral information in the third dimension
tiledims (tuple) – dimension of tiles to be used during computation (rows, columns)
nodataVal (Union[int, float, None]) – nodata value of the input image
alg_nodata (str) –
algorithm defining how to deal with pixels where the spectral bands of the source image contain nodata within the spectral response of a target band
‘radical’: set output band to nodata
‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)
CPUs (Optional[int]) – CPUs to use for processing

Return type

ndarray

Returns

resampled spectral image cube

resample_signature(spectrum, scale_factor=10000, nodataVal=None, alg_nodata='radical', v=False)[source]

Resample the given spectrum according to the spectral response functions of the target instrument.

Parameters

spectrum (ndarray) – spectral signature data
scale_factor (int) – the scale factor to apply to the given spectrum when it is plotted (default: 10000)
nodataVal (Union[int, float, None]) – no data value to be respected during resampling
alg_nodata (str) –
algorithm defining how to deal with pixels where the spectral bands of the source image contain nodata within the spectral response of a target band
‘radical’: set output band to nodata
‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)
v (bool) – enable verbose mode (shows a plot of the resampled spectrum) (default: False)

Return type

ndarray

Returns

resampled spectral signature
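
The idea behind response-weighted resampling and the two alg_nodata modes can be sketched for a single target band as follows. This is a simplified illustration, not the SpectralResampler implementation: the helper resample_band and all sample values are hypothetical, and the response is interpolated to the source wavelengths here instead of to a 1 nm grid.

```python
import numpy as np


def resample_band(wvl_src, spectrum, wvl_rsr, rsr, nodata=None, alg_nodata='radical'):
    """Hypothetical helper: response-weighted mean of a spectrum for one target band."""
    # interpolate the band response to the source wavelength positions
    weights = np.interp(wvl_src, wvl_rsr, rsr, left=0.0, right=0.0)
    spectrum = np.asarray(spectrum, dtype=float)

    if nodata is not None:
        # nodata only matters where the target band actually responds
        bad = (spectrum == nodata) & (weights > 0)
        if bad.any():
            if alg_nodata == 'radical':
                return float(nodata)
            # 'conservative': ignore nodata and use the remaining information
            weights = np.where(bad, 0.0, weights)

    if weights.sum() == 0:
        return float(nodata) if nodata is not None else float('nan')
    return float(np.average(spectrum, weights=weights))


wvl_src = np.array([500., 510., 520., 530.])
wvl_rsr = np.array([505., 515., 525.])
rsr = np.array([0., 1., 0.])  # toy triangular response

clean = np.array([100., 200., 300., 400.])
gappy = np.array([100., 200., -9999., 400.])

print(resample_band(wvl_src, clean, wvl_rsr, rsr))
print(resample_band(wvl_src, gappy, wvl_rsr, rsr, nodata=-9999, alg_nodata='radical'))
print(resample_band(wvl_src, gappy, wvl_rsr, rsr, nodata=-9999, alg_nodata='conservative'))
```

The ‘conservative’ result differs from the clean one because the missing source band no longer contributes, which is the alteration the parameter description warns about.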

resample_spectra(spectra, chunksize=200, nodataVal=None, alg_nodata='radical', CPUs=None)[source]

Resample the given spectral signatures according to the spectral response functions of the target instrument.

Parameters

spectra (Union[GeoArray, ndarray]) – spectral signatures, provided as 2D array (rows: spectral samples, columns: spectral information / bands)
chunksize (int) – defines how many spectral signatures are resampled per CPU
nodataVal (Union[int, float, None]) – no data value to be respected during resampling
alg_nodata (str) –
algorithm defining how to deal with pixels where the spectral bands of the source image contain nodata within the spectral response of a target band
‘radical’: set output band to nodata
‘conservative’: use existing spectral information and ignore nodata (might alter the output spectral information, e.g., at spectral absorption bands)
CPUs (Optional[int]) – CPUs to use for processing

Return type

ndarray

property rsr_1nm

property wvl_1nm

spechomo.training_data module

class spechomo.training_data.RefCube(filepath='', satellite='', sensor='', LayerBandsAssignment=None)[source]

Bases: object

Data model class for reference cubes holding the training data for later fitted machine learning classifiers.

Get instance of RefCube.

Parameters

filepath (str) – file path for importing an existing reference cube from disk
satellite (str) – the satellite for which the reference cube holds its spectral data
sensor (str) – the sensor for which the reference cube holds its spectral data
LayerBandsAssignment (Optional[list]) – the LayerBandsAssignment for which the reference cube holds its spectral data

add_refcube_array(refcube_array, src_imnames, LayerBandsAssignment)[source]

Add the given array to the RefCube instance.

Parameters

refcube_array (Union[str, ndarray]) – 3D array or file path of the reference cube to be added (spectral samples /signatures x training images x spectral bands)
src_imnames (list) – list of training image file base names from which the given cube received data
LayerBandsAssignment (list) – LayerBandsAssignment of the spectral bands of the given 3D array

Return type

None

add_spectra(spectra, src_imname, LayerBandsAssignment)[source]

Add a set of spectral signatures to the reference cube.

Parameters

spectra (ndarray) – 2D numpy array with rows: spectral samples / columns: spectral information (bands)
src_imname (str) – image basename of the source hyperspectral image
LayerBandsAssignment (list) – LayerBandsAssignment for the spectral dimension of the passed spectra, e.g., [‘1’, ‘2’, ‘3’, ‘4’, ‘5’, ‘6L’, ‘6H’, ‘7’, ‘8’]

Return type

None
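
The relation between the 2D spectra passed to add_spectra and the 3D reference cube layout (spectral samples x training images x spectral bands) can be pictured with plain NumPy. This only illustrates the array shapes; the variable names are made up and the sketch does not use the RefCube class itself.

```python
import numpy as np

n_signatures, n_bands = 4, 3

# spectra sampled from two training images (rows: samples, columns: bands)
spectra_img1 = np.full((n_signatures, n_bands), 1)
spectra_img2 = np.full((n_signatures, n_bands), 2)

# each 2D block becomes one column of the 3D cube
cube = np.stack([spectra_img1, spectra_img2], axis=1)
print(cube.shape)  # samples x images x bands
```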

property col_imName_dict

Return an ordered dict containing the file base names of the original training images for each column.

Return type: OrderedDict

get_band_combination(tgt_LBA)[source]

Get an array according to the bands order given by a target LayerBandsAssignment.

Parameters: tgt_LBA (List[str]) – target LayerBandsAssignment
Return type: GeoArray

get_spectra_dataframe(tgt_LBA)[source]

Return a pandas.DataFrame [sample x band] according to the given LayerBandsAssignment.

Parameters: tgt_LBA (List[str]) – target LayerBandsAssignment
Return type: DataFrame

property metadata: Return an ordered dictionary holding the metadata of the reference cube.

property n_clusters: Return the number of spectral clusters used for clustering source images for the reference cube.

property n_images: Return the number of training images from which the reference cube contains spectral samples.

property n_signatures: Return the number of spectral signatures per training image included in the reference cube.

property n_signatures_per_cluster

plot_sample_spectra(image_basename, cluster_label='all', include_mean_spectrum=True, include_median_spectrum=True, ncols=5, **kw_fig)[source]

Return type: plt.figure

read_data_from_disk(filepath)[source]

rearrange_layers(tgt_LBA)[source]

Rearrange the spectral bands of the reference cube according to the given LayerBandsAssignment.

Parameters: tgt_LBA (List[str]) – target LayerBandsAssignment
Return type: None
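
Reordering bands according to a target LayerBandsAssignment amounts to an index lookup along the band axis. The following sketch shows this on a toy array; it assumes the bands sit in the last dimension and is not the RefCube code.

```python
import numpy as np

src_LBA = ['1', '2', '3', '4']  # band order of the existing cube
tgt_LBA = ['4', '2', '1']       # requested band order

cube = np.arange(2 * 1 * 4).reshape(2, 1, 4)  # samples x images x bands

# position of each target band within the source band order
band_idx = [src_LBA.index(band) for band in tgt_LBA]
reordered = cube[:, :, band_idx]

print(reordered[0, 0].tolist())
```

Bands missing from the source assignment raise a ValueError in this sketch, which makes an incompatible target LayerBandsAssignment visible early.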

save(path_out, fmt='ENVI')[source]

Save the reference cube to disk.

Parameters

path_out (str) – output path on disk
fmt (str) – output format as GDAL format code

Return type

None

property wavelengths

class spechomo.training_data.TrainingData(im_X, im_Y, test_size)[source]

Bases: object

Class for analyzing statistical relations between a pair of machine learning training data cubes.

Get instance of TrainingData.

Parameters

im_X (Union[GeoArray, ndarray]) – input image X
im_Y (Union[GeoArray, ndarray]) – input image Y
test_size (Union[float, int]) – test size (proportion as float between 0 and 1) or absolute value as integer
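
The two accepted forms of test_size can be illustrated with a small NumPy sketch. The helper split_samples, its seed argument, and the toy arrays are hypothetical; how TrainingData performs the split internally is not stated here.

```python
import numpy as np


def split_samples(X, Y, test_size, seed=0):
    """Hypothetical helper: split paired samples into train and test subsets."""
    n = X.shape[0]
    if isinstance(test_size, float):
        n_test = int(round(n * test_size))  # proportion of all samples
    else:
        n_test = int(test_size)             # absolute number of samples

    # shuffle the sample indices once and cut them into two parts
    order = np.random.default_rng(seed).permutation(n)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], Y[train_idx], Y[test_idx]


X = np.arange(20).reshape(10, 2)  # 10 samples, 2 source bands
Y = X * 2                         # paired target samples

X_train, X_test, Y_train, Y_test = split_samples(X, Y, test_size=0.3)
print(X_train.shape, X_test.shape)
```

Passing test_size=4 instead would hold back exactly four samples, regardless of the total sample count.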

plot_scatter_matrix(figsize=(15, 15), mode='intersensor')[source]

plot_scattermatrix()[source]

show_band_scatterplot(band_src_im, band_tgt_im)[source]

spechomo.utils module

spechomo.utils.download_pretrained_classifiers(method, tgt_dir='/builds/geomultisens/spechomo/spechomo/resources/classifiers')[source]

spechomo.utils.explore_classifer_dillfile(path_dillFile)[source]

List all homogenization transformations included in the given .dill file.

Parameters: path_dillFile (str) – path of the .dill file to be explored
Return type: DataFrame

spechomo.utils.export_classifiers_as_JSON(export_rootDir, classifier_rootDir='/builds/geomultisens/spechomo/spechomo/resources/classifiers', method=None, src_sat=None, src_sen=None, src_LBA=None, tgt_sat=None, tgt_sen=None, tgt_LBA=None, n_clusters=None)[source]

Export spectral harmonization classifiers as JSON files that match the provided filtering criteria.

NOTE: So far, this function will only work for LR classifiers.

Parameters

export_rootDir (str) – directory where to save the exported JSON files
classifier_rootDir (str) – directory containing classifiers for homogenization, either as .zip archives or as .dill files
method (Optional[str]) – filter by the machine learning approach to be used for spectral bands prediction
src_sat (Optional[str]) – filter by source satellite, e.g., ‘Landsat-8’
src_sen (Optional[str]) – filter by source sensor, e.g., ‘OLI_TIRS’
src_LBA (Optional[List[str]]) – filter by source bands list
tgt_sat (Optional[str]) – filter by target satellite, e.g., ‘Landsat-8’
tgt_sen (Optional[str]) – filter by target sensor, e.g., ‘OLI_TIRS’
tgt_LBA (Optional[List[str]]) – filter by target bands list
n_clusters (Optional[int]) – filter by the number of spectral clusters to be used during LR/ RR/ QR homogenization

Return type

None

spechomo.utils.im2spectra(geoArr)[source]

Convert 3D images to array of spectra samples (rows: samples; cols: spectral information).

Return type: ndarray

spechomo.utils.list_available_transformations(classifier_rootDir='/builds/geomultisens/spechomo/spechomo/resources/classifiers', method=None, src_sat=None, src_sen=None, src_LBA=None, tgt_sat=None, tgt_sen=None, tgt_LBA=None, n_clusters=None)[source]

List all sensor transformations available according to the given classifier root directory.

NOTE: This function can be used to copy/paste possible input parameters for spechomo.SpectralHomogenizer.predict_by_machine_learner().

Parameters

classifier_rootDir (str) – directory containing classifiers for homogenization, either as .zip archives or as .dill files
method (Optional[str]) – filter results by the machine learning approach to be used for spectral bands prediction
src_sat (Optional[str]) – filter results by source satellite, e.g., ‘Landsat-8’
src_sen (Optional[str]) – filter results by source sensor, e.g., ‘OLI_TIRS’
src_LBA (Optional[List[str]]) – filter results by source bands list
tgt_sat (Optional[str]) – filter results by target satellite, e.g., ‘Landsat-8’
tgt_sen (Optional[str]) – filter results by target sensor, e.g., ‘OLI_TIRS’
tgt_LBA (Optional[List[str]]) – filter results by target bands list
n_clusters (Optional[int]) – filter results by the number of spectral clusters to be used during LR/ RR/ QR homogenization

Return type

DataFrame

Returns

pandas.DataFrame listing all the available transformations
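
The optional filter arguments follow a common pattern in which None means "do not filter on this field". A minimal pandas sketch of that pattern is shown below; the column names, the toy rows, and the helper filter_table are assumptions for illustration and do not reflect the actual columns of the returned DataFrame.

```python
import pandas as pd

# toy table; the column names are assumptions, not the library's actual output
df = pd.DataFrame({
    'method': ['LR', 'LR', 'RR'],
    'src_sat': ['Landsat-8', 'Sentinel-2A', 'Landsat-8'],
    'n_clusters': [50, 50, 1],
})


def filter_table(df, **criteria):
    """Hypothetical helper: keep rows matching every criterion that is not None."""
    mask = pd.Series(True, index=df.index)
    for column, value in criteria.items():
        if value is not None:
            mask &= df[column] == value
    return df[mask]


result = filter_table(df, method='LR', src_sat=None, n_clusters=50)
print(result)
```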

spechomo.utils.spectra2im(spectra, tgt_rows, tgt_cols)[source]

Convert array of spectra samples (rows: samples; cols: spectral information) to a 3D image.

Parameters

spectra (Union[GeoArray, ndarray]) – 2D array with rows: spectral samples / columns: spectral information (bands)
tgt_rows (int) – number of target image rows
tgt_cols (int) – number of target image columns

Return type

ndarray

Returns

3D array (rows x columns x spectral bands)
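
The conversion between a 3D image and a 2D array of spectra corresponds to a reshape along the two spatial axes. The sketch below shows the round trip with plain NumPy, assuming row-major pixel order; it mirrors the documented shapes but is not the code of im2spectra or spectra2im.

```python
import numpy as np

image = np.arange(2 * 3 * 4).reshape(2, 3, 4)  # rows x columns x bands
rows, cols, bands = image.shape

spectra = image.reshape(rows * cols, bands)    # one row per pixel
restored = spectra.reshape(rows, cols, bands)  # back to a 3D image

print(spectra.shape, restored.shape)
```

The round trip only restores the original image if tgt_rows and tgt_cols match the shape the spectra were taken from.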
