Abstract
As high-performance computing (HPC) systems approach exascale, storage systems are stretched to their limits by growing I/O traffic. Researchers are building storage systems on top of fast compute-node-local storage devices (such as NVMe SSDs) to alleviate the I/O bottleneck. However, user jobs vary in their I/O bandwidth requirements, so installing expensive storage devices on every compute node and building them into a single global storage system is wasteful. In addition, current node-local storage systems must cope with the challenging small I/O and rank 0 I/O patterns of HPC workloads. In this paper, we present a workload-aware temporary cache (WatCache) to meet these challenges. We design a workload-aware node allocation method that allocates fast storage devices to jobs according to their I/O requirements and merges the devices allocated to each job into a separate temporary cache space. We implement a metadata caching strategy that reduces the metadata overhead of I/O requests to improve small I/O performance. We design a data layout strategy that distributes consecutive data exceeding a threshold across multiple devices to achieve higher aggregate bandwidth for rank 0 I/O. Through extensive tests with several I/O benchmarks and applications, we validate that WatCache offers linearly scalable performance and brings significant performance improvements to small I/O and rank 0 I/O patterns.
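The threshold-based data layout idea can be illustrated with a minimal sketch. This is not WatCache's implementation: the function name `layout`, the parameters `threshold`, `stripe`, and `home`, and the round-robin placement are all assumptions made for illustration. Consecutive data at or below the threshold stays on one device, while larger data is split into fixed-size chunks spread over several devices.

```python
from typing import List, Tuple


def layout(length: int, num_devices: int, threshold: int,
           stripe: int, home: int = 0) -> List[Tuple[int, int, int]]:
    """Return (device, offset, chunk_length) pieces for one consecutive write.

    Hypothetical policy, assumed for illustration only:
    - data of at most `threshold` bytes stays on the `home` device;
    - larger data is cut into `stripe`-byte chunks placed round-robin,
      starting at `home`, so several devices serve the request.
    """
    if length <= 0:
        return []
    if length <= threshold or num_devices == 1:
        return [(home, 0, length)]

    pieces = []
    offset = 0
    index = 0
    while offset < length:
        chunk = min(stripe, length - offset)  # last chunk may be short
        pieces.append(((home + index) % num_devices, offset, chunk))
        offset += chunk
        index += 1
    return pieces


if __name__ == "__main__":
    # Small write: kept on a single device.
    print(layout(100, num_devices=4, threshold=128, stripe=64))
    # Large write: spread across four devices.
    print(layout(200, num_devices=4, threshold=128, stripe=64))
```

Under this assumed policy, a large write issued by a single rank is served by several devices at once, which is the effect the abstract attributes to the layout strategy for rank 0 I/O.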