Abstract
Distributed storage systems use network coding techniques such as replication, erasure codes, local codes, regeneration codes, hybrid codes, double codes and Group Repair Codes (GRC) to store data efficiently and to recover it quickly after failures. These approaches are compared mainly on the storage they require and on their repair bandwidth. Of these, GRC has the optimal repair bandwidth for regenerating nodes. Traditionally, the cost of regeneration was taken to depend on the number of nodes participating in the process and the amount of data transferred. The heterogeneity of the network and the capacity of the links between nodes received little attention. In practice, nodes are connected by links of different capacities, so the same amount of data takes a different time to reach its destination depending on the link used. Selecting nodes with higher link capacity therefore reduces the data transfer time. Taking this heterogeneity into account, this paper reduces the regeneration time for GRC. Node selection algorithms for data regeneration are proposed for GRC, and simulation results show a significant improvement in regeneration time. Future work may explore network coding in heterogeneous systems with respect to factors such as network traffic, intermediate nodes and data routing.
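The effect of link capacity on regeneration time can be illustrated with a small sketch. It is not the paper's node selection algorithm: it assumes that every provider sends the same amount of data directly to the newcomer, that all transfers run in parallel, and that regeneration finishes when the slowest selected link finishes. The node names, capacities and function names are hypothetical.

```python
def transfer_time(data_mb, capacity_mbps):
    """Seconds needed to send data_mb megabytes over a link of capacity_mbps megabits/s."""
    return data_mb * 8 / capacity_mbps


def select_providers(link_capacity, d):
    """Return the d providers with the highest link capacity to the newcomer."""
    ranked = sorted(link_capacity, key=link_capacity.get, reverse=True)
    return ranked[:d]


def regeneration_time(link_capacity, providers, data_mb):
    """Assumption: parallel transfers, so the slowest selected link sets the time."""
    return max(transfer_time(data_mb, link_capacity[p]) for p in providers)


# Hypothetical capacities (Mbit/s) from each surviving node to the newcomer.
capacity = {"n1": 100, "n2": 20, "n3": 80, "n4": 50, "n5": 10}

chosen = select_providers(capacity, 3)   # capacity-aware choice
naive = ["n1", "n2", "n5"]               # a choice that ignores capacity

t_chosen = regeneration_time(capacity, chosen, 64)
t_naive = regeneration_time(capacity, naive, 64)
print(chosen, t_chosen, t_naive)
```

Under these assumptions the capacity-aware choice is limited by its 50 Mbit/s link, while the naive choice is limited by the 10 Mbit/s link, so the same 64 MB per provider takes about five times longer.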
Highlights
Network coding has been widely used for storing the data in distributed and cloud storage systems to provide 100% availability of data
These were used in Google file system (Ghemawat et al, 2003), OceanStore
Wang et al (2014) proposed node selection algorithms that choose a newcomer node and provider nodes from a set of nodes in heterogeneous distributed storage systems to optimize the cost of repair
Summary
Network coding has been widely used for storing the data in distributed and cloud storage systems to provide 100% availability of data. Wang et al (2014) proposed node selection algorithms that choose a newcomer node and provider nodes from a set of nodes in heterogeneous distributed storage systems to optimize the cost of repair. Newcomer and provider selection has since been investigated by several authors to optimize the cost of repair and minimize the regeneration time in networks whose links between nodes have different capacities (Gong et al, 2015; Jia et al, 2015; 2016). Ye et al (2021) proposed a hybrid genetic algorithm that determines the optimal repair tree, taking into account the network topology and the link bandwidth of the storage nodes to reduce the delay of repairing a faulty node. Because Group Repair Codes (GRC) have the minimum repair bandwidth and require less storage space, this paper focuses on further optimizing data regeneration with GRC in heterogeneous networks.
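As a rough illustration of joint newcomer and provider selection, the sketch below tries each candidate newcomer, keeps its d fastest incoming links, and picks the newcomer whose slowest kept link is fastest. This is a simplified stand-in, not the algorithm of any of the cited works: it assumes direct provider-to-newcomer links, equal data per provider and parallel transfers. All names and capacity values are hypothetical.

```python
def best_newcomer(capacity, candidates, survivors, d, data_mb):
    """Return (newcomer, time_s) minimizing the bottleneck transfer time.

    capacity maps (provider, newcomer) pairs to link capacity in Mbit/s.
    """
    best = None
    for nc in candidates:
        # Keep the d highest-capacity links into this candidate newcomer.
        links = sorted((capacity[(p, nc)] for p in survivors), reverse=True)[:d]
        # Assumption: transfers are parallel, so the slowest kept link decides.
        t = data_mb * 8 / links[-1]
        if best is None or t < best[1]:
            best = (nc, t)
    return best


# Hypothetical link capacities (Mbit/s).
capacity = {
    ("a", "x"): 100, ("b", "x"): 10, ("c", "x"): 20,
    ("a", "y"): 40,  ("b", "y"): 50, ("c", "y"): 5,
}

result = best_newcomer(capacity, ["x", "y"], ["a", "b", "c"], d=2, data_mb=64)
print(result)
```

Here newcomer x has the single fastest link (100 Mbit/s) but its second-best link is only 20 Mbit/s, so y, whose two best links are 50 and 40 Mbit/s, gives the shorter regeneration time under these assumptions.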
More From: International Journal of Mathematical, Engineering and Management Sciences