Abstract

The exponential growth of computing power on leadership-scale computing platforms poses a grand challenge to the input/output (I/O) performance of scientific applications. To bridge the performance gap between computation and I/O, various parallel I/O libraries have been developed and adopted by computer scientists. These libraries enhance I/O parallelism by allowing multiple processes to access a shared data set concurrently. They also integrate a set of I/O optimization strategies, such as data sieving and two-phase I/O, to better exploit the bandwidth supplied by the underlying parallel file system. Most of these techniques are optimized for access to a single bundle of variables that a scientific application generates during its I/O phase and that is stored as a file. Few focus on cross-bundle I/O optimizations. In this article, we investigate the potential benefit of cross-bundle I/O aggregation. Based on an analysis of the I/O patterns of a mission-critical scientific application, the Goddard Earth Observing System, version 5 (GEOS-5), we propose a Bundle-based PARallel Aggregation (BPAR) framework with three partitioning schemes to improve its I/O performance as well as that of a broad range of other scientific applications. Our experimental results reveal that BPAR can deliver a 2.1× I/O performance improvement over the baseline GEOS-5 and that it is very promising for accelerating scientific applications’ I/O performance on various computing platforms.

