Abstract
Distributed storage systems adopt fault-tolerance methods such as replication or erasure coding to guarantee data reliability. These methods ensure that data can be recovered through redundancy when a storage node fails. However, this redundancy often incurs nontrivial bandwidth overhead to transmit large numbers of replicas and blocks. Prior methods focus on reducing this network cost through careful scheduling. In this article, we aim to improve transmission efficiency along an orthogonal dimension: optimizing storage locations according to the characteristics of data center networks. We focus on server-centric data centers (such as BCube), where any pair of nodes is interconnected by multiple redundant paths. Transmissions of replicas or blocks can therefore be sped up significantly by using the redundant paths concurrently. Building on this insight, we design the node-disjoint storage strategy for the multireplica storage system and the nested node-disjoint storage strategy for the erasure-coded storage system. Evaluations indicate that, compared with conventional methods adopted in current storage systems, our methods save 46.6%–62.1% of the transmission time in the multireplica storage system and 71.5%–80.8% of the transmission time in the erasure-coded storage system.