Abstract
The repair problem in distributed storage addresses recovery of data encoded with an erasure code, for instance a Reed–Solomon (RS) code. We consider the problem of repairing a single node or multiple nodes in RS-coded storage systems using the smallest possible amount of inter-nodal communication. According to the cut-set bound, the communication cost of repairing $h\geqslant 1$ failed nodes for an $(n,k=n-r)$ maximum distance separable (MDS) code using $d$ helper nodes is at least $dhl/(d+h-k)$, where $l$ is the size of the node. Guruswami and Wootters (2016) initiated the study of efficient repair of RS codes, showing that they can be repaired using a smaller bandwidth than under the trivial approach. At the same time, their work as well as follow-up papers stopped short of constructing RS codes (or any scalar MDS codes) that meet the cut-set bound with equality. In this paper, we construct families of RS codes that achieve the cut-set bound for the repair of one or several nodes. In the single-node case, we present RS codes of length $n$ over the field ${\mathbb F}_{q^{l}}$, $l=\exp ((1+o(1))n\log n)$, that meet the cut-set bound. We also prove an almost matching lower bound on $l$, showing that super-exponential scaling is both necessary and sufficient for scalar MDS codes to achieve the cut-set bound using linear repair schemes. For the case of multiple nodes, we construct a family of RS codes that achieve the cut-set bound universally for the repair of any $h=1,2,{\dots},r$ failed nodes from any subset of $d$ helper nodes, $k\leqslant d\leqslant n-h$. For a fixed number of parities $r$, the node size of the constructed codes is close to the smallest possible node size for codes with such properties.
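To make the cut-set bound concrete, the following minimal sketch evaluates $dhl/(d+h-k)$ for sample parameters and compares it with the trivial approach of downloading the full contents of $k$ nodes. The function names and the parameter values ($k=10$, $l=4$) are illustrative assumptions and do not come from the paper; costs are expressed in the same units as the node size $l$.

```python
from fractions import Fraction


def cut_set_bound(d, h, k, l):
    """Lower bound d*h*l/(d+h-k) on the repair bandwidth (hypothetical helper).

    d: number of helper nodes, h: number of failed nodes,
    k: code dimension, l: node size. Requires d >= k and h >= 1.
    """
    if h < 1 or d < k:
        raise ValueError("need h >= 1 and d >= k")
    return Fraction(d * h * l, d + h - k)


def trivial_repair_cost(k, l):
    """Cost of the trivial approach: download k whole nodes (hypothetical helper)."""
    return k * l


# Illustrative parameters (assumed, not from the paper): k = 10, node size l = 4.
single = cut_set_bound(d=13, h=1, k=10, l=4)   # one failed node, 13 helpers
double = cut_set_bound(d=12, h=2, k=10, l=4)   # two failed nodes, 12 helpers
trivial = trivial_repair_cost(k=10, l=4)

print(single, double, trivial)
```

For these sample values the bound is 13 units for a single failure and 24 units for two failures, against 40 units for the trivial approach, which illustrates the gap that codes meeting the bound with equality would close.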