Abstract

To ensure data reliability and security, large-scale distributed storage systems usually adopt a data redundancy mechanism so that data on faulty nodes can be repaired. Compared with replication, erasure-code redundancy uses storage space more efficiently, but it can incur a large network overhead when recovering data. Regenerating codes are an improved class of erasure codes that reduce the amount of data transmitted during repair. Repairing the data on a faulty node with a regenerating code requires constructing an optimal repair tree that maximizes the bandwidth of the bottleneck link, which is an NP-hard problem. To construct the optimal repair tree, this paper proposes a hybrid genetic algorithm. In particular, the proposal takes into account the network topology and the link bandwidth between storage nodes, and it designs problem-specific crossover, mutation, and local search operators. In addition, we provide a mathematical proof that the proposed hybrid genetic algorithm converges globally with probability one. A series of simulation experiments shows that the proposal is able to find the optimal repair tree, which effectively reduces the delay of faulty-node repair in distributed storage systems and improves repair efficiency.
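To make the optimization objective concrete, the following is a minimal Python sketch of how a candidate repair tree could be scored by its bottleneck link. It is an illustration only, not the paper's algorithm: the function name `bottleneck_bandwidth`, the parent-map encoding of the tree, the node labels, and the bandwidth values are all assumptions introduced here. A genetic algorithm of the kind described would use a score like this as its fitness and search over tree shapes for the highest value.

```python
from typing import Dict, Hashable, Tuple

Node = Hashable
Edge = Tuple[Node, Node]  # (sender, receiver); hypothetical encoding


def bottleneck_bandwidth(parent: Dict[Node, Node],
                         bandwidth: Dict[Edge, float],
                         root: Node) -> float:
    """Return the smallest link bandwidth used by a candidate repair tree.

    parent maps each provider node to the node it sends its data to.
    root is the replacement node; it must not appear as a key in parent.
    """
    if root in parent:
        raise ValueError("root must not have a parent")
    worst = float("inf")
    for child in parent:
        # Walk up towards the root to confirm this node really reaches it.
        seen = {child}
        node = child
        while node != root:
            if node not in parent:
                raise ValueError(f"{node!r} is not connected to the root")
            node = parent[node]
            if node in seen:
                raise ValueError("cycle detected, not a tree")
            seen.add(node)
        # The tree is only as fast as its slowest link.
        worst = min(worst, bandwidth[(child, parent[child])])
    return worst


# Made-up link bandwidths (for example, in Mbit/s) towards replacement node "r".
bandwidth = {
    ("a", "r"): 10.0,
    ("b", "r"): 5.0,
    ("c", "r"): 20.0,
    ("b", "a"): 40.0,
}

# Star topology: every provider sends directly to the replacement node.
star = {"a": "r", "b": "r", "c": "r"}
# Relay topology: "b" avoids its slow direct link by sending through "a".
relay = {"a": "r", "b": "a", "c": "r"}

print(bottleneck_bandwidth(star, bandwidth, "r"))   # 5.0
print(bottleneck_bandwidth(relay, bandwidth, "r"))  # 10.0
```

In this toy instance, routing provider `b` through `a` doubles the bottleneck bandwidth relative to the star topology, which is the kind of improvement a repair-tree search aims for.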

