Abstract
The behavior of the metadata server (MDS) cluster is critical to the overall performance of today's petabyte-scale or even exabyte-scale distributed file systems. Maintaining both high locality and good load balancing is a significant challenge for MDS clusters. However, traditional metadata management schemes, including hash-based mapping and subtree partitioning, are heavily biased toward either locality or load balancing at the expense of the other. In this paper, we propose D^2-Tree, a distributed double-layer namespace tree partition scheme for metadata management in large-scale storage systems. The key idea is a greedy strategy that splits the namespace tree into a global layer and local-layer subtrees: the global layer is replicated to maintain load balancing, and the lower-half subtrees are allocated separately to MDSs by a mirror division method to preserve locality. Both theoretical analysis based on the empirical cumulative distribution and extensive experiments validate the efficiency of D^2-Tree. Experiments using real trace data on Amazon EC2 also show that D^2-Tree outperforms many previously proposed schemes.
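The double-layer split can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `Node` type, the `popularity` counts, the fixed `global_size` budget, and the "heaviest subtree first" rule are all assumptions made for illustration, and the least-loaded-server placement in `assign_subtrees` is a simple stand-in for the mirror division method, whose details the abstract does not give.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    popularity: int  # hypothetical access count for this directory alone
    children: list = field(default_factory=list)


def subtree_load(node):
    """Total popularity of a node and all of its descendants."""
    return node.popularity + sum(subtree_load(c) for c in node.children)


def split_layers(root, global_size):
    """Grow the global layer greedily from the root.

    At each step the frontier node with the heaviest subtree joins the
    global layer (assumed rule). Nodes left on the frontier become the
    roots of the local-layer subtrees.
    """
    global_layer = []
    counter = 0  # unique tie-breaker so heapq never compares Node objects
    frontier = [(-subtree_load(root), counter, root)]
    while frontier and len(global_layer) < global_size:
        _, _, node = heapq.heappop(frontier)
        global_layer.append(node.name)
        for child in node.children:
            counter += 1
            heapq.heappush(frontier, (-subtree_load(child), counter, child))
    local_roots = [node for _, _, node in frontier]
    return global_layer, local_roots


def assign_subtrees(local_roots, num_servers):
    """Place each whole subtree on the currently least-loaded server.

    Keeping a subtree on one server preserves locality; this greedy
    placement is a stand-in, not the paper's mirror division method.
    """
    servers = [(0, i) for i in range(num_servers)]
    heapq.heapify(servers)
    placement = {}
    for node in sorted(local_roots, key=subtree_load, reverse=True):
        load, server = heapq.heappop(servers)
        placement[node.name] = server
        heapq.heappush(servers, (load + subtree_load(node), server))
    return placement


# Hypothetical namespace with made-up popularity counts.
root = Node("/", 10, [
    Node("/home", 5, [Node("/home/a", 40), Node("/home/b", 30)]),
    Node("/var", 20),
    Node("/usr", 15),
])
global_layer, local_roots = split_layers(root, global_size=2)
placement = assign_subtrees(local_roots, num_servers=2)
```

In this toy namespace the global layer, which would be replicated on every MDS, contains `/` and `/home`, while the four remaining subtrees are spread over the two servers as whole units.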