
MLBS: Transparent Data Caching in Hierarchical Storage for Out-of-Core HPC Applications

Authors:
Rached Abdelkhalak
Tariq Alturkestani
Hatem Ltaief
David E. Keyes
V. Etienne
Thierry Tonellot
Source:
HiPC
Publication Year:
2019
Publisher:
IEEE, 2019.

Abstract

Out-of-core simulation systems produce and/or consume massive amounts of data that cannot fit in the memory of a single compute node and that usually need to be read and/or written back and forth during computation. I/O data movement may thus represent a bottleneck in large-scale simulations. To increase I/O bandwidth, high-end supercomputers are equipped with hierarchical storage subsystems such as node-local and remote-shared NVMe and SSD-based Burst Buffers. Advanced caching systems have recently been developed to efficiently utilize the multi-layered nature of the new storage hierarchy. Using these software components yields more efficient data accesses, but at the cost of reduced computational kernel performance and a limited number of applications that can simultaneously utilize the additional storage layers. We introduce MultiLayered Buffer Storage (MLBS), a data object container that provides novel methods for caching and prefetching data in out-of-core scientific applications, so that expensive I/O operations are performed asynchronously on systems equipped with hierarchical storage. The main idea is to decouple I/O operations from computational phases, using dedicated hardware resources to perform the expensive context switches. MLBS monitors I/O traffic in each storage layer, allowing fair utilization of shared resources while controlling the impact on the kernels' performance. By continually prefetching up and down across all hardware layers of the memory/storage subsystems, MLBS transforms the original I/O-bound behavior of the evaluated applications and shifts it closer to a memory-bound regime. Our evaluation on a Cray XC40 system for a representative I/O-bound application, seismic inversion, shows that MLBS outperforms state-of-the-art filesystems, i.e., Lustre, Data Elevator, and DataWarp, by 6.06X, 2.23X, and 1.90X, respectively.
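To illustrate the core idea of decoupling I/O from computation, the following minimal C++ sketch overlaps the read of the next data block with the compute kernel on the current block (a simple double-buffering/prefetching pattern). This is not the authors' MLBS implementation: the function names, block layout, and synthetic "slow tier" read are hypothetical placeholders standing in for reads from Lustre or a Burst Buffer.

#include <cstdio>
#include <future>
#include <vector>

// Hypothetical "slow tier" read: loads one block of an out-of-core data set.
// A real system would read from Lustre or a Burst Buffer; here we synthesize
// data so the sketch stays self-contained and runnable.
std::vector<double> load_block(std::size_t block_id, std::size_t block_size) {
    std::vector<double> block(block_size);
    for (std::size_t i = 0; i < block_size; ++i)
        block[i] = static_cast<double>(block_id) + 1e-6 * static_cast<double>(i);
    return block;
}

// Stand-in for the application's compute kernel on one resident block.
double compute_on(const std::vector<double>& block) {
    double sum = 0.0;
    for (double v : block) sum += v;
    return sum;
}

int main() {
    const std::size_t num_blocks = 8;
    const std::size_t block_size = 1 << 20;
    double total = 0.0;

    // Prefetch the first block, then keep one block "in flight" at all times,
    // so the expensive I/O is hidden behind the computation on the prior block.
    auto next = std::async(std::launch::async, load_block, std::size_t{0}, block_size);
    for (std::size_t b = 0; b < num_blocks; ++b) {
        std::vector<double> current = next.get();          // wait for the prefetched block
        if (b + 1 < num_blocks)                             // immediately start the next read
            next = std::async(std::launch::async, load_block, b + 1, block_size);
        total += compute_on(current);                       // compute while the next read proceeds
    }
    std::printf("checksum = %f\n", total);
    return 0;
}

The same pattern generalizes to writes (flush the previous block asynchronously while computing the next one) and, as in MLBS, to prefetching across multiple storage tiers rather than a single file.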

Details

Database:
OpenAIRE
Journal:
2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC)
Accession number:
edsair.doi...........cf07dcc08a0988bd301d501a168bfc43
Full Text:
https://doi.org/10.1109/hipc.2019.00046