
Overview

Coscine offers a variety of different storage systems for resources. Object storage-based systems such as the S3 storage store all data in so-called buckets. To uniquely identify data, the bucket name, the object key (name of the object), and the endpoint of the web service are required.
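
To make the three identifying parts concrete, the following minimal sketch combines an endpoint, a bucket name, and an object key into a single URL. The endpoint, bucket name, and key are placeholders rather than actual Coscine values, and the path-style layout (endpoint/bucket/key) is an assumption about how the service is addressed.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class ObjectLocation:
    """The three parts needed to identify one object in S3 storage."""

    endpoint: str  # URL of the S3 web service (placeholder value below)
    bucket: str    # bucket name
    key: str       # object key, i.e. the name of the object

    def path_style_url(self) -> str:
        # Assumption: path-style addressing (endpoint/bucket/key).
        # quote() percent-encodes spaces and keeps "/" in the key as-is.
        return f"{self.endpoint.rstrip('/')}/{self.bucket}/{quote(self.key)}"


location = ObjectLocation(
    endpoint="https://s3.example.org",  # hypothetical endpoint
    bucket="my-bucket",                 # hypothetical bucket name
    key="raw data/run 01.csv",          # hypothetical object key
)
print(location.path_style_url())
```

The URL only identifies the object. Reading or writing it still requires the access keys described in the next section.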

Access

Access to data in an S3 bucket is bound to access rights. To access buckets in S3 storage, so-called access keys must be used, which can be associated with either read or write permissions. They are located on the respective resource page: To view them, open the desired S3 resource and click on the gray button with the list icon next to the resource name. Only project members with the Owner or Member role can see the access keys, as they have write access. Guests, who only have read access, cannot see them.

Interaction options

You can interact with S3 resources through the Coscine user interface, the REST API, or S3 clients. If the access keys and the bucket ID are known, S3 libraries such as boto3 or the S3 clients presented below can be used to interact directly with the data.
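
As a rough illustration of the library route, the sketch below connects with boto3 and lists the object keys in a bucket. It assumes boto3 is installed and that the access keys consist of an access key and a secret key pair. The endpoint URL, key values, and bucket name are placeholders, and the helper names `client_kwargs`, `make_client`, and `list_keys` are made up for this example.

```python
def client_kwargs(endpoint_url, access_key, secret_key):
    """Collect the keyword arguments for boto3.client (all values are placeholders)."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_client(endpoint_url, access_key, secret_key):
    """Create an S3 client. boto3 is imported here so the other helpers work without it."""
    import boto3  # third-party package, assumed to be installed

    return boto3.client(**client_kwargs(endpoint_url, access_key, secret_key))


def list_keys(client, bucket, prefix=""):
    """Yield every object key in the bucket, following pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages without matching objects carry no "Contents" entry.
        for obj in page.get("Contents", []):
            yield obj["Key"]


if __name__ == "__main__":
    # Hypothetical values: replace them with those shown on the resource page.
    s3 = make_client("https://s3.example.org", "READ_ACCESS_KEY", "READ_SECRET_KEY")
    for key in list_keys(s3, "my-bucket"):
        print(key)
```

Listing only needs read permissions. Uploading with a call such as `upload_file` would need the keys that carry write permissions.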

Note

The maximum file size for transfers depends on the method. Via the Coscine user interface or the API, the limit is 100 GB per file. Files larger than that can be transferred via an S3 client.
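
The size rule in this note can be expressed as a small check. This is only a sketch: the helper name `transfer_options` is made up, and reading "100 GB" as 100 × 10^9 bytes (decimal gigabytes) is an assumption.

```python
import os

# Assumption: "100 GB" means decimal gigabytes (100 * 10**9 bytes).
UI_API_LIMIT_BYTES = 100 * 10**9


def transfer_options(size_bytes):
    """Return the transfer methods that can handle a file of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes <= UI_API_LIMIT_BYTES:
        return ["user interface", "REST API", "S3 client"]
    # Above the limit, only an S3 client remains.
    return ["S3 client"]


def transfer_options_for_file(path):
    """Same check, using the size of a file on disk."""
    return transfer_options(os.path.getsize(path))


print(transfer_options(5 * 10**9))
print(transfer_options(150 * 10**9))
```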

Reasons for using S3 clients

Using S3 resources is recommended when large volumes of data are present or expected, or when data documentation requires a separation of data and metadata. Unlike with web resources, metadata does not have to be entered per file in the web interface; it can also be stored collectively in external files.

Obtaining Storage Space for S3 Resources

Since Coscine adheres to the FAIR principles, it must be ensured that data is adequately described with metadata. Because S3 buckets can be accessed without any interaction with Coscine, the planned research data management must be described in the storage request before S3 storage space can be allocated. This includes, among other things, specifying the metadata profiles used and how the metadata is obtained and stored. Further information can be found in the Coscine documentation under the keyword storage space application.

Usage of S3 clients

S3 clients provide a direct connection to the S3 storage. They act completely independently of Coscine and therefore provide access to your data even if the Coscine web interface fails. Files can be uploaded, downloaded, and also edited via S3 clients. Since the S3 clients communicate directly with the underlying storage system, faster uploads and downloads of larger files are usually possible. For smaller files (< 5 GB), however, this difference hardly matters.

In the following documentation, we present the S3 applications Cyberduck, MinIO and WinSCP in more detail, with which an uncomplicated connection to the Coscine S3 storage is possible.

What does an example of interaction with the S3 storage using Python look like?

An example of the interaction with the S3 storage with Python is summarized in the following GitLab repository and can be used as inspiration for creating your own S3 clients.

S3 Sample Script

Performance

Upload and download performance depends primarily on your device, network performance, the S3 client, and the data storage on the other end of the transfer.

Tests with various clients showed that DataStorage.nrw is capable of transferring several hundred MB per second. In practice, however, these speeds are often not achieved due to the weaker performance of home networks.

You should start troubleshooting by examining your own device and network. It is recommended to run speed tests to measure upload and download speeds. Such tests are offered online by various providers, for example the broadband measurement (Breitbandmessung) of the German Bundesnetzagentur.

If the measured values and the transfer rates in the S3 client differ significantly, the S3 client may be the bottleneck. In particular, aspects such as the number of parallel connections or encryption can affect the speed. Please also check the manufacturer's support pages or online communities for optimization options or known issues.
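
The comparison described above is simple arithmetic, sketched below. Speed tests usually report Mbit/s, while S3 clients often show bytes transferred and elapsed time, so the units must be aligned first. The function names and the 50 % tolerance are illustrative assumptions, not a rule from Coscine.

```python
def throughput_mbit_s(bytes_transferred, seconds):
    """Convert a transfer (bytes, duration) into megabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_transferred * 8 / seconds / 1_000_000


def client_is_likely_bottleneck(client_mbit_s, line_mbit_s, tolerance=0.5):
    """Flag the client if it reaches less than `tolerance` of the measured line speed.

    The default of 0.5 is an arbitrary illustrative threshold.
    """
    return client_mbit_s < line_mbit_s * tolerance


# Example: the client moved 1.5 GB in 120 s; the speed test measured 400 Mbit/s upload.
client_rate = throughput_mbit_s(1_500_000_000, 120)
print(client_rate)                                    # 100.0
print(client_is_likely_bottleneck(client_rate, 400))  # True
```

In this example the client reaches 100 Mbit/s on a line measured at 400 Mbit/s, which would suggest looking at the client settings first.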

Note regarding files and transfer speed: File size can affect the speed. When transferring large files of several hundred MB, the speed usually increases after the transfer begins. The number of files can also have an impact if parallel connections are enabled as an option in the S3 client.
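
For many small files, the effect of parallel connections can be sketched with a thread pool from the Python standard library. The function `upload_all` and its `upload_one` argument are hypothetical; `upload_one` stands for whatever uploads a single file, for example a small wrapper around the `upload_file` method of a boto3 client.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_all(upload_one, paths, max_workers=4):
    """Call upload_one(path) for every path, using up to max_workers threads.

    Results are returned in the same order as `paths`. If an upload raises,
    the exception is re-raised here when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(upload_one, paths))


if __name__ == "__main__":
    # Stand-in for a real upload so the sketch runs without a network connection.
    def fake_upload(path):
        return f"uploaded {path}"

    print(upload_all(fake_upload, ["a.csv", "b.csv", "c.csv"], max_workers=2))
```

Whether more workers help depends on the network and the storage. Beyond a certain number of connections, the transfer rate usually stops improving.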

S3 policies

Advanced users can use S3 policies via the command line interface (CLI) of MinIO to get further information on buckets or files. More information is available at the following link: Link to S3 policies overview