Data management is generally challenging, particularly on HPC systems. Modern HPC systems offer several storage tiers with different characteristics, for instance the availability of backups, the storage capacity, the I/O performance, node-local versus globally available access, the semantics of the storage system, and the duration for which the storage endpoint is available, which ranges from years to quarters and sometimes only hours. This variety is confusing and entails different challenges and risks. First of all, users have to be aware of the different storage tiers and their performance profiles to optimize their job runtimes, so that jobs are not left starving for data or waiting for minutes until a Python environment has loaded. Users then need to move their results back to a storage tier with enough space and durability so that they do not lose them at the end of a computation or soon after. While moving input and output data around, users have to keep track of data provenance to ensure the reproducibility and retrospective comprehensibility of their research. In addition, users sometimes do not want to copy an entire data set but only explore a concise subset. For this, a data catalog can be used in which all available data is indexed with respect to domain-specific metadata. Once this data catalog is filled with all of a user's data sets, concise queries can be used to select the input data and, ideally, stage it to the correct storage tier as part of the job submission process. The data catalog can also be used to keep an overview of all data distributed over the different storage tiers.

This course provides an introduction to the different storage tiers available at GWDG and the workloads for which they should be used. Then the concept of a data catalog and its usage will be covered. Both parts will offer hands-on exercises on our HPC system.
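To make the data catalog idea more concrete, the following is a minimal in-memory sketch in Python, not the tool used in the course. The names `DataCatalog`, `CatalogEntry`, `register`, `query`, and `stage`, as well as the tier labels, are hypothetical and chosen for illustration only. The sketch indexes data sets by metadata, selects a subset with a query, and copies the selection to a faster tier while recording where each copy came from.

```python
from dataclasses import dataclass, field
from pathlib import Path
import shutil


@dataclass
class CatalogEntry:
    path: Path                                     # where the file currently lives
    tier: str                                      # hypothetical tier label, e.g. "archive"
    metadata: dict = field(default_factory=dict)   # domain-specific metadata


class DataCatalog:
    """Illustrative in-memory catalog; a real one would persist its index."""

    def __init__(self):
        self.entries = []

    def register(self, path, tier, metadata=None):
        """Index one file together with its tier and metadata."""
        entry = CatalogEntry(Path(path), tier, dict(metadata or {}))
        self.entries.append(entry)
        return entry

    def query(self, tier=None, **criteria):
        """Return entries whose metadata matches every given key/value pair."""
        return [
            e for e in self.entries
            if (tier is None or e.tier == tier)
            and all(e.metadata.get(k) == v for k, v in criteria.items())
        ]

    def stage(self, entries, target_dir, target_tier):
        """Copy the selected files to target_dir and index the copies."""
        target_dir = Path(target_dir)
        target_dir.mkdir(parents=True, exist_ok=True)
        staged = []
        for e in entries:
            dest = target_dir / e.path.name
            shutil.copy2(e.path, dest)
            # Record provenance so the copy can be traced back to its source.
            meta = {**e.metadata, "staged_from": str(e.path)}
            staged.append(self.register(dest, target_tier, meta))
        return staged
```

In a job script, one might call `catalog.query(tier="archive", experiment="run42")` and pass the result to `catalog.stage(...)` with a node-local directory before the computation starts; both the metadata keys and the directory layout here are assumptions.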
25
Some basic experience with working with HPC systems
Online (BigBlueButton)
This event includes the following dates: