The scientific community relies heavily on shared, standardized datasets such as ImageNet or Sentinel-2 imagery. To host these popular datasets in a central store, the GWDG offers the Data Pools service. On our HPC systems, Data Pools achieve significantly higher performance than conventional cloud-based approaches. The GWDG also provides a number of standard datasets and derived data products, such as machine learning models. The service is not only for consuming data: users can also share and host their own versioned datasets within our HPC systems, so that other users can build on those datasets or data products in their own research. Data Pools are designed specifically for the scientific community and provide versioned, citable datasets.

In this course, we will teach you how to discover existing Data Pools and how to publish your own dataset as a Data Pool to share it with others.
50
Basic Linux and HPC experience
Online (BigBlueButton)
This event includes the following dates: