Add new data to the database

There are three ways to add new satellite data to the locally stored database: you can use the WebUI, run the data downloader from the command line, or add the data manually.

In each case, two steps have to be carried out:

  • the downloaded provider archive data need to be physically copied to the data storage directory on disk

  • the respective metadata entries need to be added to the GeoMultiSens metadata database

Hint

Regarding the metadata entry, these conditions must be fulfilled to make GeoMultiSens recognize a dataset as properly added:

  • the ‘scenes’ table of the GeoMultiSens metadata database must contain a corresponding entry (if it does not, the database needs to be updated by the metadata crawler, which has to be done by the database administrator)

  • the ‘filename’ column of the respective entry in the ‘scenes’ table must contain a valid filename string

  • the ‘proc_level’ column of the respective entry in the ‘scenes’ table must be at least ‘DOWNLOADED’
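The three conditions above can be expressed as a small check. The following is a hypothetical sketch, not part of the GeoMultiSens API: the record is assumed to be a dict of column values from the ‘scenes’ table, and the list of processing-level names beyond ‘DOWNLOADED’ is an assumption.

```python
# Assumed set of valid processing levels; anything in this set means the
# scene has at least been downloaded.
PROC_LEVELS = {'DOWNLOADED', 'L1A', 'L1B', 'L1C', 'L2A', 'L2B', 'L2C'}


def is_properly_added(record):
    """Return True if a 'scenes' record (a dict of column values, or None
    if no entry exists) fulfils the three conditions listed above."""
    if record is None:                 # no corresponding 'scenes' entry at all
        return False
    if not record.get('filename'):     # 'filename' must be a valid string
        return False
    return record.get('proc_level') in PROC_LEVELS  # at least 'DOWNLOADED'
```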

Using the data downloader

The GeoMultiSens data downloader downloads the requested data and makes sure that the new dataset is properly added to the local GeoMultiSens data storage directory as well as to the metadata database.

All you need to do is:

cd /opt/gms-modules  # default installation path of gms-modules
bash gms-cli-frontend --download 13552123

This downloads the satellite provider archive that belongs to scene 13552123 of the GeoMultiSens metadata database. When using the WebUI, these scene IDs are automatically passed to the downloader module. However, when running the data downloader from the command line as shown above, you need to know the scene IDs of the scenes you want to download.

To find out these scene IDs, you can query the GeoMultiSens metadata database as follows:

from gms_preprocessing.options.config import get_conn_database
from gms_preprocessing.misc.database_tools import get_info_from_postgreSQLdb

get_info_from_postgreSQLdb(
    conn_params=get_conn_database('localhost'),
    tablename='scenes',
    vals2return=['id'],
    cond_dict={
        'entityid': ['LE70450322008300EDC00',
                     'LE70450322008284EDC01']
        }
    )

This returns the scene IDs of two Landsat-7 scenes with the entity IDs ‘LE70450322008300EDC00’ and ‘LE70450322008284EDC01’:

OUT:
[(13547246,), (13552123,)]
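Note that the query returns a list of one-element tuples, one per matching scene. Before handing the IDs to the downloader, you may want to flatten them to plain integers (plain Python, no database connection needed):

```python
# Query results come back as one-element tuples, one per matching scene.
rows = [(13547246,), (13552123,)]

# Flatten to plain integers.
scene_ids = [row[0] for row in rows]
print(scene_ids)  # [13547246, 13552123]
```

Each of these IDs can then be passed to the data downloader as shown above.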

Add new data manually

You can also add datasets to the local GeoMultiSens data storage which you previously downloaded on your own (e.g., via EarthExplorer or the Copernicus Open Access Hub).

The following code snippet will exemplarily import two Landsat-7 scenes into the GeoMultiSens database:

from gms_preprocessing.options.config import get_conn_database
from gms_preprocessing.misc.database_tools import add_externally_downloaded_data_to_GMSDB

add_externally_downloaded_data_to_GMSDB(
    conn_DB=get_conn_database('localhost'),
    src_folder='/path/to/your/downloaded_data_directory/',
    filenames=['LE71510322000093SGS00.tar.gz',
               'LE71910232012294ASN00.tar.gz'],
    satellite='Landsat-7',
    sensor='ETM+'
    )

However, this currently only works for Landsat legacy data or if the given filenames are already known in the GeoMultiSens metadata database.

In other cases, you have to:

  1. copy the provider data archives to the GeoMultiSens data storage directory (choose the proper sub-directory corresponding to the right sensor)

  2. register the new datasets in the GeoMultiSens metadata database as follows:

from gms_preprocessing.options.config import get_conn_database
from gms_preprocessing.misc.database_tools import update_records_in_postgreSQLdb

entityids = ["LE70450322008300EDC00",
             "LE70450322008284EDC01"]
filenames = ["LE07_L1TP_045032_20081026_20160918_01_T1.tar.gz",
             "LE07_L1TP_045032_20081010_20160918_01_T1.tar.gz"]

for eN, fN in zip(entityids, filenames):
    update_records_in_postgreSQLdb(conn_params=get_conn_database('localhost'),
                                   tablename='scenes',
                                   vals2update_dict={
                                        'filename': fN,
                                        'proc_level': 'DOWNLOADED'},
                                   cond_dict={
                                        'entityid': eN
                                        }
                                   )
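Step 1 above, copying the provider archives into the data storage directory, can be sketched as follows. The helper function and the paths are placeholders and not part of GeoMultiSens; adapt them to your storage layout and the sub-directory of the right sensor.

```python
import shutil
from pathlib import Path


def copy_archives_to_storage(src_folder, storage_subdir, filenames):
    """Copy provider data archives into a GeoMultiSens data storage
    sub-directory (hypothetical helper; paths are placeholders)."""
    dst = Path(storage_subdir)
    dst.mkdir(parents=True, exist_ok=True)  # create the sensor sub-directory if needed
    copied = []
    for fn in filenames:
        # copy2 preserves file metadata such as modification times
        copied.append(shutil.copy2(Path(src_folder) / fn, dst / fn))
    return copied
```

After copying, run the registration snippet above so that the ‘filename’ and ‘proc_level’ columns of the corresponding ‘scenes’ entries are updated.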