Metadata-Extraction

Metadata extraction and metadata templating will be available in Coscine from mid-July. Initially, however, the feature will only be activated on request so that test users can try it out and tell us about their experiences.

If you would like to try out the feature, please read through this page and then send a request to servicedesk@rwth-aachen.de. In your request, include the URL(s) of the resource(s) for which the feature should be activated.

!!! info "Note"
    Please note that only owners should make such a request.

The aim of metadata extraction is to save time when entering metadata and to make the metadata more complete. Metadata extraction reads metadata automatically from a file, e.g. an image, and saves it directly. A simple example is an image of a fruit basket containing different types of fruit (apples, bananas and oranges). Applied to this image, metadata extraction can determine that the file is an image, that it shows e.g. 5 apples, 3 bananas and 6 oranges, and what the size of the image is, among other things.
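To make the idea concrete, here is a minimal Python sketch of what extracting technical metadata from an image file can look like. It is not the Coscine extractor: the function name `extract_basic_metadata` and the metadata keys are hypothetical, it uses only the Python standard library, and it handles only PNG dimensions. Content-based values such as the number of apples in the picture would require image recognition and are outside the scope of this sketch.

```python
import mimetypes
import struct
from pathlib import Path

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def extract_basic_metadata(path):
    """Return a dict of simple technical metadata for the file at `path`.

    Hypothetical helper for illustration only; the key names are assumptions.
    """
    path = Path(path)
    data = path.read_bytes()
    # Guess the MIME type from the file extension (None if unknown).
    mime_type, _ = mimetypes.guess_type(path.name)
    metadata = {
        "file_name": path.name,
        "mime_type": mime_type,
        "size_bytes": len(data),
    }
    # After the signature comes the IHDR chunk: a 4-byte length, the 4-byte
    # type "IHDR", then width and height as big-endian unsigned 32-bit ints.
    if data[:8] == PNG_SIGNATURE and data[12:16] == b"IHDR":
        width, height = struct.unpack(">II", data[16:24])
        metadata["width_px"] = width
        metadata["height_px"] = height
    return metadata
```

Calling `extract_basic_metadata("basket.png")` on a PNG file would return the file name, the guessed MIME type, the file size in bytes, and the pixel dimensions; for other file formats the dimension keys are simply absent.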

If metadata extraction does not succeed, the cause may lie with the extractor for the file format you are using, for example because none exists yet. It is therefore best to check the GitLab metadata extraction project to see whether an extractor has already been written for your file format. If not, writing one would be a first step towards a possible solution.

Metadata templating also makes it easier to save metadata. After activation, you can find the feature in the settings of your resource under the Application profile tab. Scroll to the bottom of the page and click the "Open extraction template" button at the bottom left. Then specify a file to which the templating should be applied. You will see the application profile on the right and the extracted metadata on the left. To create a template for your metadata fields, simply drag the extracted values from left to right. Pay attention to the settings of each metadata field: if a field requires a string, for example, and you enter something else, an error message will appear.
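Conceptually, a template maps each field of the application profile to one of the extracted values and checks that the value fits the field. The following Python sketch illustrates that idea only; it does not reflect how Coscine implements templating, and the function `apply_template`, the field names, and the type table are all hypothetical.

```python
def apply_template(template, extracted, field_types):
    """Fill application profile fields from extracted metadata.

    template:    maps profile field name -> key in the extracted metadata
    extracted:   the extracted metadata as a dict
    field_types: maps profile field name -> expected Python type
    All names are illustrative assumptions, not part of any Coscine API.
    """
    filled = {}
    for field, source_key in template.items():
        value = extracted[source_key]
        expected = field_types[field]
        # Mirror the behaviour described above: a value of the wrong type
        # for a field results in an error instead of being saved.
        if not isinstance(value, expected):
            raise TypeError(
                f"Field '{field}' expects {expected.__name__}, "
                f"got {type(value).__name__}"
            )
        filled[field] = value
    return filled


# Example data (made up for illustration).
extracted = {"file_name": "basket.png", "width_px": 640}
field_types = {"title": str, "image_width": int}
template = {"title": "file_name", "image_width": "width_px"}

print(apply_template(template, extracted, field_types))
# prints {'title': 'basket.png', 'image_width': 640}
```

Mapping `width_px` onto the `title` field instead would raise a `TypeError`, which corresponds to the error message shown when a value does not match the settings of a metadata field.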

The GitLab repository also contains a simple example of how to extract metadata with a Python script. You are welcome to try it out yourself and contribute to best practices.

In the future, owners and members will be able to switch the feature on and off themselves on the Coscine platform. Once this has been implemented, we will inform you via our mailing list.