An Optimal Recovery Approach for Liberation Codes in Distributed Storage Systems

Ningjing Liang, Hailong Yang, Xiaoshe Dong, Xingjun Zhang, Changjiang Zhang

doi:10.1109/access.2020.3012190

Open Access

https://doi.org/10.1109/access.2020.3012190

Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 33	License type: CC BY 4.0

Affiliation: Xi'an Jiaotong University, Beihang University

Abstract

To reduce storage cost, distributed storage systems are increasingly using erasure codes to ensure data reliability. Liberation codes, which satisfy the maximum distance separable (MDS) property and provide optimal modification overhead, are a popular class of two-fault-tolerant erasure codes. However, when recovering from a single node failure, erasure codes need to read large amounts of data from surviving nodes and transfer it across the network. Existing single node failure recovery approaches for Liberation codes are either time-consuming or suboptimal. In this article, we first prove the minimum number of symbols required to recover one failed node in a Liberation coded system. Then we derive the conditions that optimal recovery solutions need to satisfy. Finally, we propose an algorithm, called Disk Read Optimal Recovery (DROR), which can determine an optimal recovery solution in linear time and recover the failed node while reading the minimum amount of data. We have implemented DROR in a real-world storage system, Ceph, and evaluated it on a cluster of Amazon EC2 instances. We show that DROR reduces the reconstruction time by up to 23.6% compared to that of the recovery approach in Ceph.

Highlights

Inexpensive components are preferred for use in modern distributed storage systems due to the economic benefits; these components are less reliable, and data may become temporarily or permanently unavailable.
We propose a recovery algorithm called Disk Read Optimal Recovery (DROR), which reaches the lower bound of disk read and, in theory, reduces disk read by almost 25% compared with the conventional approach.
We study the problem of minimizing the number of symbols read from surviving nodes when repairing an erased data node in Liberation coded storage systems.
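
To make the "almost 25%" figure concrete, here is a rough arithmetic sketch. It assumes a stripe with k data nodes holding w symbols each, and assumes the conventional approach reads every symbol from k surviving nodes (the remaining data nodes plus one parity node). The function names and the example values of k and w are illustrative only; the exact lower bound proved in the paper is not reproduced here.

```python
def conventional_reads(k: int, w: int) -> int:
    """Symbols read when all w symbols are fetched from each of k surviving nodes.

    Assumption: conventional repair of one data node reads the k - 1
    surviving data nodes and one parity node in full.
    """
    return k * w


def reads_after_reduction(k: int, w: int, reduction: float = 0.25) -> float:
    """Approximate symbol count after the stated ~25% theoretical saving.

    This only applies the percentage quoted in the highlights; it is not
    the paper's lower-bound formula.
    """
    return conventional_reads(k, w) * (1 - reduction)


# Hypothetical stripe: k = 6 data nodes, w = 7 symbols per node.
baseline = conventional_reads(6, 7)
approx_optimal = reads_after_reduction(6, 7)
```

For this hypothetical stripe the baseline is 42 symbols, and a 25% saving corresponds to roughly 31.5 symbols, so an actual recovery solution would read a nearby whole number of symbols.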

Summary

INTRODUCTION

Inexpensive components are preferred for use in modern distributed storage systems due to the economic benefits; these components are less reliable, and data may become temporarily or permanently unavailable. With replication, lost data can be recovered by copying any one surviving replica. With erasure codes, by contrast, recovery must read data from multiple surviving nodes and transfer it across the network. This k-factor increase in both disk I/O and network traffic results in a long recovery time, which may seriously affect the system service performance. The consensus for storage systems is that two-failure tolerance is the right level of tolerance, assuming that data stripes are not large. Our work supports this trend: we are concerned with one kind of MDS RAID-6 codes, Liberation codes, and investigate their recovery performance in distributed storage systems.
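
The contrast between the two recovery costs can be sketched with a toy example. The code below is not a Liberation code: it uses a single XOR parity over k hypothetical data blocks, which is enough to show why repairing one block reads k blocks, whereas replication reads one. All block contents and function names are assumptions made for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_replicated(replicas, lost):
    """Replication: copy any one surviving replica, so one block is read."""
    survivor = next(r for i, r in enumerate(replicas) if i != lost)
    return survivor, 1


def recover_xor_parity(data, parity, lost):
    """Single XOR parity: read the k - 1 surviving data blocks plus the parity."""
    reads = [b for i, b in enumerate(data) if i != lost] + [parity]
    return xor_blocks(reads), len(reads)


# k = 4 hypothetical data blocks protected by one XOR parity block.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

rebuilt, blocks_read = recover_xor_parity(data, parity, lost=2)
copy, replica_reads = recover_replicated([data[2]] * 3, lost=0)
```

Here the parity-based repair reads four blocks to rebuild one, while the replicated repair reads a single block; that gap is the k-factor the paragraph above refers to.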

BACKGROUND

MATRIX-VECTOR DEFINITION

TWO-DIMENSIONAL ARRAY DESCRIPTION

READ-OPTIMAL RECOVERY SEQUENCES

READ-OPTIMAL RECOVERY ALGORITHM

RESULTS

Findings

CONCLUSION

Similar Papers

A construction of (5,3) MDS codes with optimal repair capability for distributed storage systems
Sheng Guan ... Xin Wang
01 Oct 2017

BASIC Codes for Distributed Storage Systems
Hanxu Hou ... Yunghsiang S Han
01 Jul 2017

Practical Single Node Failure Recovery Using Fractional Repetition Codes in Data Centers
May Itani ... Islam Elkabbani
01 Mar 2016

Repair Optimal Erasure Codes Through Hadamard Designs
Dimitris S Papailiopoulos ... Viveck R Cadambe
IEEE Transactions on Information Theory | VOL. 59
01 May 2013