HDF5 Cache VOL: Efficient and Scalable Parallel I/O through Caching Data on Node-local Storage

Modern high performance computing (HPC) systems provide multiple memory and storage layers to bridge the performance gap between fast memory and slow disk-based storage systems managed by Lustre or GPFS. Several recent HPC systems are equipped with SSD- and NVMe-based storage attached locally to compute nodes, and a few provide an SSD-based “burst buffer” intermediate storage layer accessible by all compute nodes as a single file system. Although these hardware layers are intended to reduce the latency gap between memory and disk-based long-term storage, how to utilize them has been left to users. High-level I/O libraries, such as HDF5 and netCDF, can potentially take advantage of node-local storage as a cache to reduce I/O latency to and from capacity storage. However, using node-local storage in parallel I/O is challenging, especially for a single shared file. In this paper, we present an approach that integrates node-local storage as a transparent caching or staging layer in a high-level parallel I/O library without placing the burden of managing these layers on users. The approach moves data asynchronously between the caching storage layer and the parallel file system so that the data movement overlaps with compute phases. We implement this approach as an external HDF5 Virtual Object Layer (VOL) connector, named Cache VOL. VOL is a layer of abstraction in HDF5 that intercepts the public HDF5 application programming interface (API) and allows various data-movement optimizations to be performed after interception. Existing HDF5 applications can use Cache VOL with minimal code modifications. We evaluated the performance of Cache VOL with HPC applications such as VPIC-IO and deep learning applications such as ImageNet and CosmoFlow. We show that with Cache VOL one can achieve higher observed I/O performance and more scalable, stable I/O than writing directly to the parallel file system, and thus faster time-to-solution in scientific simulations. While the caching approach is implemented in HDF5, the methods are applicable to other high-level I/O libraries.
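
To make the "minimal code modifications" claim concrete, the sketch below shows a typical parallel HDF5 write phase that would remain unchanged under Cache VOL; in this reading, the connector is loaded externally through HDF5's dynamic VOL mechanism (the HDF5_PLUGIN_PATH and HDF5_VOL_CONNECTOR environment variables). This is an illustrative sketch rather than code from the paper, and the connector string and config file name in the comment are assumptions that should be checked against the Cache VOL documentation.

    /* Standard parallel HDF5 write; no Cache VOL-specific calls are needed
     * when the connector is loaded dynamically, e.g. (assumed connector string):
     *   export HDF5_PLUGIN_PATH=/path/to/cache_vol/lib
     *   export HDF5_VOL_CONNECTOR="cache_ext config=cache.cfg;under_vol=0;under_info={};"
     */
    #include <hdf5.h>
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        const hsize_t n_per_rank = 1024;
        double *buf = malloc(n_per_rank * sizeof(double));
        for (hsize_t i = 0; i < n_per_rank; i++) buf[i] = (double)rank;

        /* Collective access to a single shared file through MPI-IO */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        hid_t file = H5Fcreate("data.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

        /* One contiguous block per rank in a shared dataset */
        hsize_t dims[1] = {n_per_rank * (hsize_t)nprocs};
        hid_t fspace = H5Screate_simple(1, dims, NULL);
        hid_t dset = H5Dcreate2(file, "x", H5T_NATIVE_DOUBLE, fspace,
                                H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

        hsize_t start[1] = {n_per_rank * (hsize_t)rank}, count[1] = {n_per_rank};
        H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
        hid_t mspace = H5Screate_simple(1, count, NULL);

        hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, mspace, fspace, dxpl, buf);

        H5Pclose(dxpl); H5Sclose(mspace); H5Sclose(fspace);
        H5Dclose(dset); H5Pclose(fapl); H5Fclose(file);
        free(buf);
        MPI_Finalize();
        return 0;
    }

Because the interception happens at the VOL layer, the same program writes through the node-local cache when the connector is loaded and directly to the parallel file system when it is not.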

ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems

Scientific applications at exascale generate and analyze massive amounts of data. A critical requirement of these applications is the capability to access and manage this data efficiently on exascale systems. Parallel I/O, the key technology that enables moving data between compute nodes and storage, faces monumental challenges from the new application, memory, and storage architectures considered in the designs of exascale systems. As the storage hierarchy expands to include node-local persistent memory, burst buffers, and other layers alongside disk-based storage, data movement among these layers must be efficient. Parallel I/O libraries of the future should be capable of handling file sizes of many terabytes and beyond. In this paper, we describe new capabilities we have developed in Hierarchical Data Format version 5 (HDF5), the most popular parallel I/O library for scientific applications and one of the most used libraries at the leadership computing facilities for performing parallel I/O on existing HPC systems. The state-of-the-art features we describe include: the Virtual Object Layer (VOL), Data Elevator, asynchronous I/O, full-featured single-writer and multiple-reader (Full SWMR), and parallel querying. We introduce these features, describe their implementations, and discuss the performance and feature benefits they bring to applications and other libraries.
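
As a hedged illustration of the asynchronous I/O capability listed above, the sketch below uses the explicit event-set (H5ES) API available in recent HDF5 releases (1.13 and later), on which asynchronous VOL connectors build. It is not code from the paper; the function write_then_compute is a hypothetical example, and it assumes an async-capable VOL connector has been loaded (without one, the calls simply execute synchronously).

    /* Sketch of explicit asynchronous HDF5 I/O using the event-set (H5ES) API.
     * Assumes an asynchronous-capable VOL connector has been loaded via
     * HDF5_VOL_CONNECTOR; otherwise the operations complete synchronously. */
    #include <hdf5.h>

    void write_then_compute(hid_t dset, hid_t mspace, hid_t fspace,
                            const double *buf) {
        hid_t es = H5EScreate();          /* event set tracks pending async ops */

        /* Enqueue the write; returns quickly when an async VOL is active */
        H5Dwrite_async(dset, H5T_NATIVE_DOUBLE, mspace, fspace,
                       H5P_DEFAULT, buf, es);

        /* ... application compute phase overlaps the data movement here ... */

        size_t num_in_progress = 0;
        hbool_t op_failed = 0;
        /* Block until every operation in the event set has completed */
        H5ESwait(es, H5ES_WAIT_FOREVER, &num_in_progress, &op_failed);
        H5ESclose(es);
    }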
