Abstract
Optimizing input-output (I/O) at the level of the on-disk data layout drastically impacts the efficiency of high performance computing (HPC) applications. However, such low-level optimization is generally challenging, especially when using popular scientific file formats designed with an emphasis on portability and flexibility. To reconcile these two aspects, we present a novel low-level data layout for HPC applications that is fully independent of the number of dimensions in the dataset. The new data layout improves reading and writing efficiency in large HPC applications using many processors, in particular during parallel post-processing. Furthermore, combining it with a cached write mode, which aggregates multiple writes into larger ones, substantially decreased the writing times of the proposed strategy. When applied to our simulation framework for the forward calculation of the human electrocardiogram, the combined strategy resulted in drastic improvements in I/O performance, of up to 40% in writing and 93–98% in reading for post-processing tasks. Given the generality of the proposed strategies and of the scientific file formats used, our results may represent significant improvements in the I/O performance of HPC applications across multiple disciplines, reducing execution and post-processing times and leading to a more efficient use of HPC resource envelopes.
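The idea behind a cached write mode, aggregating many small writes into fewer large ones, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `CachedWriter`, the `capacity` threshold, and the in-memory sink are all hypothetical and chosen only to show the general technique.

```python
import io


class CachedWriter:
    """Hypothetical write cache: buffers small writes and emits them as one large write."""

    def __init__(self, sink, capacity=1 << 20):
        self.sink = sink            # any object exposing write(bytes)
        self.capacity = capacity    # flush threshold in bytes (assumed value)
        self.chunks = []            # pending small writes
        self.size = 0               # total bytes currently buffered
        self.flush_count = 0        # number of writes that actually reached the sink

    def write(self, data):
        # Buffer the data; only touch the sink once enough has accumulated.
        self.chunks.append(data)
        self.size += len(data)
        if self.size >= self.capacity:
            self.flush()

    def flush(self):
        # Join all pending chunks and issue a single large write.
        if self.chunks:
            self.sink.write(b"".join(self.chunks))
            self.flush_count += 1
            self.chunks = []
            self.size = 0


def demo():
    sink = io.BytesIO()             # stands in for a file on disk
    writer = CachedWriter(sink, capacity=1024)
    for i in range(100):
        writer.write(bytes([i]) * 64)   # 100 small 64-byte writes
    writer.flush()                      # push any remaining buffered data
    return sink.getvalue(), writer.flush_count
```

In this sketch the 100 small writes reach the sink as a handful of larger ones, while the bytes written are unchanged. In a real HPC setting the sink would be a file in a scientific format and the threshold would be tuned to the file system.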
Highlights
The optimization of high performance computing (HPC) codes is an area of active research, underpinning a continuous and cost-effective development of both established and emergent industrial and scientific sectors
The progress we are experiencing in computational medicine based on HPC applications is allowing the translation of mathematical models of physiological systems such as the heart to biomedical research and clinical practice
When combined with a cached write mode, the new algorithm resulted in overall improvements in I/O performance of up to 40% in writing to disk, and between 93% and 98% in reading for different post-processing tasks
Summary
The optimization of high performance computing (HPC) codes is an area of active research, underpinning a continuous and cost-effective development of both established and emergent industrial and scientific sectors. The progress we are experiencing in computational medicine based on HPC applications is allowing the translation of mathematical models of physiological systems such as the heart to biomedical research and clinical practice. Input-output optimization in high performance scientific computing. Basic Science Research Fellowship (FS/17/22/32644; https://www.bhf.org.uk/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.