Abstract

Storage systems often use erasure coding to provide fault tolerance. An erasure-coded update involves data transmission and computation across multiple nodes, so frequent updates cause massive communication overhead. This paper considers two issues: (1) under frequent small-size updates, repetitive update behaviors cause bandwidth consumption to increase exponentially as the number of update nodes increases; (2) as the data scale grows, unbalanced link usage during updates leaves some links locally busy, which makes them prone to becoming bottleneck links. To address the inefficient updates caused by these network bottlenecks, we propose SDCUP, a software-defined-control collaborative update mechanism that reduces the update time for erasure-coded data while balancing the network load. Specifically, SDCUP uses software-defined control to select the update transmission path according to the actual link load, and it adjusts the data flow transmission rate by periodically monitoring the degree of network load balance. To further reduce cross-rack update traffic, SDCUP offloads computation to the switch to aggregate data within the rack, and it parallelizes sub-update operations so that updates proceed efficiently and cooperatively. To evaluate the performance of SDCUP, we conduct simulation experiments on Mininet with real-world traces. The simulation results show that SDCUP achieves better load balance in multiple scenarios. Compared with other data update schemes, the proposed method improves system throughput by up to 21% and reduces update time by up to 47%.

Highlights

  • Based on a comprehensive trade-off among storage cost, bandwidth consumption, system load, and other factors, data center storage typically uses erasure coding as its redundancy mechanism [1]

  • An RS(n, k) code divides the original data into k data blocks, which are stored in data nodes. The k data blocks are encoded to form m (m = n − k) parity blocks, which are stored in parity nodes

  • The design idea of SDCUP is to group frequently arriving update requests, select idle paths to transmit the update groups, and schedule the update groups according to the degree of network load balance
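The path-selection idea in the last highlight can be illustrated with a minimal sketch. It assumes that link load is available as a utilization value between 0 and 1 and that "idle" means the candidate path whose busiest link is least loaded. The topology, the node names, and the helpers `path_links`, `bottleneck_load`, and `select_path` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def path_links(path: List[str]) -> List[Link]:
    """Return the consecutive (src, dst) links that make up a path."""
    return list(zip(path, path[1:]))


def bottleneck_load(path: List[str], load: Dict[Link, float]) -> float:
    """Load of the busiest link on the path (unknown links count as idle)."""
    return max(load.get(link, 0.0) for link in path_links(path))


def select_path(candidates: List[List[str]], load: Dict[Link, float]) -> List[str]:
    """Pick the candidate whose busiest link is least loaded (assumed policy)."""
    return min(candidates, key=lambda p: bottleneck_load(p, load))


# Hypothetical two-path topology between hosts h1 and h2.
load = {
    ("h1", "s1"): 0.2,
    ("s1", "c1"): 0.9,  # congested core link
    ("c1", "s2"): 0.3,
    ("s1", "c2"): 0.4,
    ("c2", "s2"): 0.5,
    ("s2", "h2"): 0.1,
}
candidates = [
    ["h1", "s1", "c1", "s2", "h2"],
    ["h1", "s1", "c2", "s2", "h2"],
]
best = select_path(candidates, load)
print(best)
```

In this example the path through `c1` is rejected because one of its links is at 0.9 utilization, even though its other links are lightly loaded. A real controller would refresh `load` from periodic switch statistics; that part is omitted here.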

Introduction

Based on a comprehensive trade-off among storage cost, bandwidth consumption, system load, and other factors, data center storage typically uses erasure coding as its redundancy mechanism [1]. Erasure-coded updates consume CPU resources for encoding computations and network bandwidth for device interaction, updated-block downloads, and data-block transfers. The performance bottleneck of erasure-coded updates lies mainly in network resources. To deploy erasure coding in data centers, existing approaches [9], [23], [24], [32] mostly adopt hierarchical node placement, placing n nodes on r racks (r < n). We consider erasure-coded storage in a data center in which the bandwidth of intra-rack links and cross-rack links differs greatly. The encoding and decoding computations of erasure coding usually incur high cross-rack transfer overhead, which creates a performance bottleneck for data updates. Our goal is to exploit the property of hierarchical node placement to minimize the cross-rack traffic triggered by erasure-coded update operations.
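To make the cross-rack cost concrete, the sketch below counts the blocks that cross rack boundaries in one update round, with and without in-rack aggregation. It rests on a simplified cost model that is an assumption, not the paper's: each updated data node sends one block to every parity node in another rack, and with aggregation each rack instead sends a single combined block per remote parity node. The placement, the node names, and the function `cross_rack_blocks` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def cross_rack_blocks(
    updated: List[str],
    parity: List[str],
    rack_of: Dict[str, str],
    aggregate: bool,
) -> int:
    """Count blocks sent across racks under the assumed cost model."""
    # Group the updated data nodes by the rack that hosts them.
    by_rack = defaultdict(list)
    for node in updated:
        by_rack[rack_of[node]].append(node)

    total = 0
    for rack, nodes in by_rack.items():
        # Only parity nodes outside this rack require cross-rack transfers.
        remote_parity = [p for p in parity if rack_of[p] != rack]
        # With aggregation the rack emits one combined block per parity node.
        senders = 1 if aggregate else len(nodes)
        total += senders * len(remote_parity)
    return total


# Hypothetical RS(9, 6) placement: k = 6 data nodes, m = 3 parity nodes, r = 3 racks.
rack_of = {
    "d0": "r0", "d1": "r0", "d2": "r0",
    "d3": "r1", "d4": "r1", "d5": "r1",
    "p0": "r2", "p1": "r2", "p2": "r2",
}
parity = ["p0", "p1", "p2"]
updated = ["d0", "d1", "d3"]

without_agg = cross_rack_blocks(updated, parity, rack_of, aggregate=False)
with_agg = cross_rack_blocks(updated, parity, rack_of, aggregate=True)
print(without_agg, with_agg)
```

Under these assumptions, the two updated nodes in rack `r0` share one aggregated transfer per parity node, so the cross-rack count drops from 9 blocks to 6. The saving grows with the number of updated nodes that share a rack.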

