Abstract

Traditional distributed file systems (DFSes) use a centralized service to manage metadata. Many studies based on this centralized architecture have enhanced metadata processing capability by scaling the metadata server cluster, but this approach still struggles to keep up with the growing number of clients and increasingly metadata-intensive applications. Some solutions abandon the centralized metadata service and improve scalability by embedding a private metadata service in an HPC application, but these solutions suit only certain applications, and the absence of a global namespace makes data sharing and management difficult. This paper addresses the shortcomings of existing studies by optimizing the consistency model of the client-side metadata cache for the HPC scenario using a novel partial consistency model. The model provides an application with a strong consistency guarantee for only its workspace, thus improving metadata scalability without adding hardware or sacrificing the versatility and manageability of DFSes. In addition, the paper proposes batch permission management to reduce path traversal overhead, thereby improving metadata processing efficiency. The result is a library (Pacon) that allows existing DFSes to achieve partial consistency for scalable and efficient metadata management. The paper also presents a comprehensive evaluation using intensive benchmarks and a representative application. For example, in file creation, Pacon improves the performance of BeeGFS by more than 76.4 times and outperforms the state-of-the-art metadata management solution (IndexFS) by more than 4.6 times.
