Abstract

The repair problem in distributed storage addresses recovery of the data encoded using an erasure code, for instance, a Reed–Solomon (RS) code. We consider the problem of repairing a single node or multiple nodes in RS-coded storage systems using the smallest possible amount of inter-nodal communication. According to the cut-set bound, communication cost of repairing $h\geqslant 1$ failed nodes for an $(n,k=n-r)$ maximum distance separable (MDS) code using $d$ helper nodes is at least $dhl/(d+h-k)$ , where $l$ is the size of the node. Guruswami and Wootters (2016) initiated the study of efficient repair of RS codes, showing that they can be repaired using a smaller bandwidth than under the trivial approach. At the same time, their work as well as follow-up papers stopped short of constructing RS codes (or any scalar MDS codes) that meet the cut-set bound with equality. In this paper, we construct the families of RS codes that achieve the cut-set bound for repair of one or several nodes. In the single-node case, we present the RS codes of length $n$ over the field ${\mathbb F}_{q^{l}}, l=\exp ((1+o(1))n\log n)$ that meet the cut-set bound. We also prove an almost matching lower bound on $l$ , showing that super-exponential scaling is both necessary and sufficient for scalar MDS codes to achieve the cut-set bound using linear repair schemes. For the case of multiple nodes, we construct a family of RS codes that achieve the cut-set bound universally for the repair of any $h=1,2, {\dots },r$ failed nodes from any subset of $d$ helper nodes, $k\leqslant d\leqslant n-h$ . For a fixed number of parities $r$ , the node size of the constructed codes is close to the smallest possible node size for codes with such properties.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call