Abstract

Recent years have seen an increasing number of hybrid scientific applications, which typically couple an HPC simulation program with corresponding data-analytics programs. Unfortunately, current computing-platform configurations do not accommodate this emerging workflow well, especially write-once-read-many workflows. This is mainly because HPC simulation programs store their output on a dedicated storage cluster equipped with a parallel file system (PFS). To perform analytics on data generated by a simulation, the data must first be migrated from the storage cluster to the compute cluster, and this migration can introduce severe delays as data sizes continue to grow. To solve the data-migration problem in small- and medium-sized HPC clusters, we propose constructing a side I/O path, named SideIO, that explicitly directs analysis data to a data-intensive file system (DIFS for short) that co-locates computation with data. Checkpoint data, in contrast, may never be read back, so it is written to the dedicated PFS to maximize I/O throughput. SideIO consists of three components: an I/O splitter, which separates simulation outputs between the two storage systems (PFS or DIFS); an I/O middleware layer, which allows unmodified HPC simulation programs to perform I/O directly over DIFS without any porting effort; and an I/O scheduler, which dynamically smooths both disk write and read traffic for the simulation and analysis programs. Experimenting with two real-world scientific workflows on a 46-node SideIO prototype, we found that SideIO achieves read/write I/O performance comparable to that of small- and medium-sized HPC clusters equipped with a PFS. More importantly, because SideIO completely avoids the most expensive data-movement overhead, it achieves up to 3x speedups for hybrid scientific workflow applications compared with current solutions.
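
The abstract does not specify how the I/O splitter and middleware intercept the simulation's writes; one common mechanism for transparently redirecting an unmodified program's I/O is an LD_PRELOAD interposition library. The sketch below illustrates that idea only as an assumption about the general technique, not as the paper's actual implementation: the mount points /pfs and /difs and the "ckpt" file-naming convention are hypothetical.

```c
/* sideio_shim.c -- hypothetical LD_PRELOAD shim sketching how an I/O
 * splitter could route an unmodified simulation's output files to PFS
 * or DIFS. Mount points and the routing rule are illustrative only. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#define PFS_MOUNT  "/pfs"   /* dedicated parallel file system (checkpoints) */
#define DIFS_MOUNT "/difs"  /* data-intensive file system (analysis data)   */

typedef int (*open_fn)(const char *, int, ...);

/* Route by an assumed file-name convention: checkpoint files go to PFS
 * for raw write bandwidth, everything else to DIFS where analytics run. */
static const char *route(const char *path, char *buf, size_t len)
{
    const char *base = strrchr(path, '/');
    base = base ? base + 1 : path;
    const char *mount = strstr(base, "ckpt") ? PFS_MOUNT : DIFS_MOUNT;
    snprintf(buf, len, "%s/%s", mount, base);
    return buf;
}

int open(const char *path, int flags, ...)
{
    static open_fn real_open = NULL;
    if (!real_open)
        real_open = (open_fn)dlsym(RTLD_NEXT, "open");

    mode_t mode = 0;
    if (flags & O_CREAT) {          /* mode argument exists only with O_CREAT */
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    char routed[4096];
    if (flags & (O_WRONLY | O_RDWR))    /* only redirect output files */
        path = route(path, routed, sizeof routed);

    return real_open(path, flags, mode);
}
```

A shim like this would be built with `gcc -shared -fPIC -o sideio_shim.so sideio_shim.c -ldl` and activated via `LD_PRELOAD=./sideio_shim.so ./simulation`; a complete splitter would also have to interpose open64, fopen, and MPI-IO entry points, which this sketch omits.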
