Data management is challenging in general, and particularly so on HPC systems. Because storage is tiered, data may be spread across several storage systems. Data-intensive research in particular often involves large data sets with many files. The well-established practice of encoding semantic metadata in paths and filenames quickly becomes unwieldy, making it hard to apply to very large data sets.

A different approach is to use a data catalog, in which a set of metadata tags is indexed and associated with individual files. This makes it possible to identify and access files through semantic queries rather than overly complicated paths.

This course provides a basic introduction to the Data Catalog tool that the GWDG offers on all of its HPC systems. Following a short presentation, participants can explore the tool on their own during a hands-on session.
25
Online (BigBlueButton)
This event includes the following dates: