On Data Parallelism of Erasure Coding in Distributed Storage Systems

Jun Li, Baochun Li

doi:10.1109/icdcs.2017.191

Abstract

Deployed in various distributed storage systems, erasure coding has demonstrated its advantages of low storage overhead and high failure tolerance. An erasure-coded distributed storage system typically uses systematic maximum distance separable (MDS) codes, because they achieve the optimal storage overhead while allowing data to be read directly without decoding. However, the data parallelism of existing MDS codes is limited: data can be read in parallel without decoding only from the specific servers that store the original data. In this paper, we propose Carousel codes, designed to allow data to be read from an arbitrary number of servers in parallel without decoding, while preserving the optimal storage overhead of MDS codes. Furthermore, Carousel codes can achieve the optimal network traffic when reconstructing an unavailable block. We have implemented a prototype of Carousel codes on Apache Hadoop. Our experimental results demonstrate that Carousel codes allow MapReduce jobs to finish in almost 50% less time and significantly reduce data access latency, with comparable encoding and decoding throughput and no additional sacrifice of failure tolerance or of the network overhead needed to reconstruct unavailable data.
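
The parallelism limit described in the abstract can be illustrated with a minimal sketch. It is not the Carousel construction from the paper; it assumes a single-parity XOR code, which is the simplest systematic MDS code, and the function names and the parameters k = 3, n = 4 are illustrative choices only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Systematic (k+1, k) code: keep the k data blocks as-is, append one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stored, missing):
    """Rebuild the block at index `missing` by XOR-ing the other k blocks."""
    survivors = [block for i, block in enumerate(stored) if i != missing]
    return xor_blocks(survivors)


k = 3
data_blocks = [b"abcd", b"efgh", b"ijkl"]   # one block per data server
stored = encode(data_blocks)                # one block per server, n servers in total
n = len(stored)

storage_overhead = n / k      # MDS-optimal for tolerating n - k failures
direct_read_servers = k       # only the first k servers hold original data

# Any single lost block (data or parity) can be rebuilt from the remaining k blocks.
recovered = [reconstruct(stored, i) for i in range(n)]
```

In this sketch only k of the n servers can serve original data without decoding, so a job reading the file in parallel can use at most k servers. That is the limitation the abstract says Carousel codes remove.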

Similar Papers

Parallelism-Aware Locally Repairable Code for Distributed Storage Systems
Jun Li ... Baochun Li
01 Jul 2018

Reference-Counter Aware Deduplication in Erasure-Coded Distributed Storage System
Tong Liu ... Shakeel Alibhai
01 Oct 2018

Data Management in Erasure-Coded Distributed Storage Systems
Chiniah Aatish ... Mungur Avinash
01 May 2020

Demand-Aware Erasure Coding for Distributed Storage Systems
Jun Li ... Baochun Li
IEEE Transactions on Cloud Computing | Vol. 9
01 Apr 2021
