1. h5bench: A unified benchmark suite for evaluating HDF5 I/O performance on pre‐exascale platforms
- Author
Bez, Jean Luca; Tang, Houjun; Breitenfeld, Scot; Zheng, Huihuo; Liao, Wei‐Keng; Hou, Kaiyuan; Huang, Zanhua; Byna, Suren
- Subjects
Data Management and Data Science, Distributed Computing and Systems Software, Information and Computing Sciences, Applied Computing, HDF5, I/O access patterns, I/O benchmarks, I/O performance, Artificial Intelligence and Image Processing, Computer Software, Distributed Computing
- Abstract
Parallel I/O is a critical technique for moving data between the compute and storage subsystems of supercomputers. With massive amounts of data produced or consumed by compute nodes, high-performance parallel I/O is essential. I/O benchmarks play an important role in this process; however, there is a scarcity of I/O benchmarks representative of current workloads on HPC systems. To provide representative I/O kernels derived from real-world applications, we have created h5bench, a set of I/O kernels that exercise Hierarchical Data Format version 5 (HDF5) I/O on parallel file systems in numerous dimensions. Our focus on HDF5 is due to the parallel I/O library's heavy usage in various scientific applications running on supercomputing systems. The tests in the h5bench suite cover I/O operations (read and write), data locality (arrays of basic data types and arrays of structures), array dimensionality (one-dimensional arrays, two-dimensional meshes, three-dimensional cubes), and I/O modes (synchronous and asynchronous). In this paper, we present the observed performance of h5bench executed along several of these dimensions on existing supercomputers (Cori and Summit) and pre-exascale platforms (Perlmutter, Theta, and Polaris). h5bench measurements can be used to identify performance bottlenecks and their root causes and to evaluate I/O optimizations. As the I/O patterns of h5bench are diverse and capture the I/O behaviors of various HPC applications, this study will be helpful to the broader supercomputing and I/O community.
- Published
2024