Abstract

Efficient caching of file data is critical to achieving high performance in data-intensive applications. However, client nodes usually have only a limited amount of memory available for caching files, even on high-performance computing platforms. Cooperative caching is an approach in which client nodes share memory for file caching and thereby provide a large file cache in aggregate. Many studies have confirmed the efficacy of applying cooperative caching to distributed file systems. However, to the best of our knowledge, no study has evaluated an implementation of cooperative caching integrated into a modern distributed file system running on a high-speed network. In this paper, we propose a method that improves the performance of a distributed file system oriented to high-performance computing by integrating cooperative caching into it. In the proposed method, the metadata server of the distributed file system maintains information about the cache in all client nodes and provides clients with the predicted cache location of any requested file. Further, InfiniBand RDMA is used to achieve fast cache transfer between the page caches of client nodes. We implemented the proposed method in the Gfarm distributed file system and measured the performance of three real-world data-intensive applications; the results indicate that the proposed method achieves a maximum speedup of 5.8%.
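
The lookup path described above can be illustrated with a minimal sketch. It is not the Gfarm implementation: the names `CacheDirectory`, `record_read`, `predict_location`, and `read_file` are hypothetical, the RDMA transfer between page caches is modeled as a plain dictionary lookup, and the fallback to storage when a prediction turns out to be stale is an assumption suggested by the word "predicted".

```python
from collections import defaultdict


class CacheDirectory:
    """Hypothetical metadata-server record of which clients may cache a file."""

    def __init__(self):
        # file path -> client node ids, most recent reader last
        self._holders = defaultdict(list)

    def record_read(self, path, client):
        # Move the client to the end so it becomes the preferred prediction.
        holders = self._holders[path]
        if client in holders:
            holders.remove(client)
        holders.append(client)

    def predict_location(self, path, requester):
        # Return the most recent reader other than the requester, if any.
        # This is only a prediction: the peer may have evicted the file.
        for client in reversed(self._holders.get(path, [])):
            if client != requester:
                return client
        return None


def read_file(path, requester, directory, peer_caches, storage):
    """Try the predicted peer cache first; fall back to storage on a miss."""
    peer = directory.predict_location(path, requester)
    # Stand-in for an RDMA read from the peer's page cache (assumption).
    data = peer_caches.get(peer, {}).get(path) if peer is not None else None
    source = "peer" if data is not None else "storage"
    if data is None:
        data = storage[path]
    # The requester now caches the file and is recorded as a holder.
    peer_caches.setdefault(requester, {})[path] = data
    directory.record_read(path, requester)
    return data, source
```

In this sketch the first reader of a file is served from storage, and later readers are directed to a peer that is expected to hold the file. Keeping only a prediction on the metadata server, rather than an exact cache map, means clients do not have to report every eviction, at the cost of an occasional miss that falls back to storage.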
