Abstract

Distributed storage systems use network coding techniques such as replication, erasure codes, local codes, regeneration codes, hybrid codes, double codes and group repair codes to store data efficiently and to recover it quickly after failures. These approaches are compared mainly on the storage they require and on their repair bandwidth. Among them, Group Repair Codes (GRC) have the optimal repair bandwidth for regenerating nodes. Traditionally, the cost of regeneration was considered to depend on the number of nodes participating in the process and the amount of data transferred. The heterogeneity of the network and the capacity of the links between the nodes received little attention. In practice, nodes are connected to each other by links of different capacities, so the same amount of data takes a different time to reach its destination depending on the link used. Selecting nodes with higher link capacity therefore reduces the data transfer time. Taking this heterogeneity into account, this paper reduces the regeneration time for GRC. Node selection algorithms for data regeneration are proposed for GRC, and the simulation results show a significant improvement in regeneration time. Further, network coding in heterogeneous systems may be explored with respect to factors such as network traffic, intermediate nodes and data routing.

Highlights

  • Network coding has been widely used for storing the data in distributed and cloud storage systems to provide 100% availability of data

  • These were used in the Google File System (Ghemawat et al, 2003), OceanStore

  • Wang et al (2014) proposed node selection algorithms that choose a newcomer node and provider nodes from a set of nodes in heterogeneous distributed storage systems to optimize the cost of repair


Summary

Introduction

Network coding has been widely used to store data in distributed and cloud storage systems to provide 100% availability of data. Wang et al (2014) proposed node selection algorithms that choose a newcomer node and provider nodes from a set of nodes in heterogeneous distributed storage systems to optimize the cost of repair. Newcomer and provider selection has since been investigated by several authors, with the aim of optimizing the cost of repair and minimizing the regeneration time in networks whose links between nodes have different capacities for transferring data (Gong et al, 2015; Jia et al, 2015; 2016). Ye et al (2021) proposed a hybrid genetic algorithm that determines the optimal repair tree, taking into account the network topology and the link bandwidth of the storage nodes to reduce the delay of repairing a faulty node. Because Group Repair Codes (GRC) have the minimum repair bandwidth and require less storage space, this paper focuses on further optimizing data regeneration for GRC in heterogeneous networks.
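
The effect of link capacity on regeneration time can be illustrated with a minimal sketch. It is not the node selection algorithm proposed in the paper: it is a simple greedy choice under stated assumptions, namely that each provider sends the same amount of data directly to the newcomer, that all transfers run in parallel, and that the slowest link therefore determines the regeneration time. The function names, node labels and capacities are hypothetical.

```python
from typing import Dict, List


def transfer_time(data_mb: float, capacity_mbps: float) -> float:
    """Seconds to move data_mb megabytes over a link of capacity_mbps megabits/s."""
    return data_mb * 8 / capacity_mbps


def select_providers(link_capacity: Dict[str, float], d: int) -> List[str]:
    """Greedy choice: the d candidates with the highest link capacity to the newcomer."""
    ranked = sorted(link_capacity, key=link_capacity.get, reverse=True)
    return ranked[:d]


def regeneration_time(link_capacity: Dict[str, float],
                      providers: List[str],
                      data_mb: float) -> float:
    """Assumption: providers send in parallel, so the slowest link dominates."""
    return max(transfer_time(data_mb, link_capacity[p]) for p in providers)


if __name__ == "__main__":
    # Hypothetical capacities (Mbit/s) from candidate providers to the newcomer.
    capacities = {"n1": 100.0, "n2": 10.0, "n3": 50.0, "n4": 20.0}
    chosen = select_providers(capacities, d=2)
    print(chosen, regeneration_time(capacities, chosen, data_mb=64.0))
    # For comparison, a capacity-unaware choice of the first two candidates.
    print(["n1", "n2"], regeneration_time(capacities, ["n1", "n2"], data_mb=64.0))
```

With these illustrative numbers, the capacity-aware choice avoids the 10 Mbit/s link, which would otherwise dominate the transfer. A realistic selection would also have to account for the factors the abstract lists as future work, such as network traffic, intermediate nodes and routing.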

Overview of the Group Repair Codes
Effect of Heterogeneity in Group Repair Code
System Design
Regeneration of Data
Result
Findings
Conclusions