Abstract

Erasure coding is the leading technique for achieving resilient redundancy in cloud storage systems. However, it introduces two prominent issues: data repair and data update. Compared to data repair, data update is much more common. A variety of update schemes based on erasure coding have been proposed in the literature to optimize data update, targeting computation optimization, network traffic overhead reduction, IO overhead reduction, and modern hardware acceleration. However, these techniques have so far been proposed individually. In this work, we seek to summarize them systematically and group them in a new form. First, we survey the state-of-the-art research and introduce existing classifications. Then, based on our observations, we propose two classifications: resource-based classification and tier-based classification. In the resource-based classification, we group these techniques according to the resource they optimize and introduce them in detail. In the tier-based classification, we propose a novel hybrid technique framework with five tiers and conduct a comprehensive comparison between these techniques. We conjecture that most techniques in different tiers can be used jointly. Finally, we summarize the research challenges and potential future work.

Highlights

  • It is estimated that approximately 3.6 billion users utilized cloud storage services in 2018 [53], with Dropbox alone claiming 500 million users in 2016 [20]

  • To meet the requirements of data update (DU) in erasure coding, a variety of update schemes based on erasure coding have been proposed in the literature to optimize DU, including computation schedule optimization [27], [41], traffic overhead reduction [39], [59], IO overhead reduction [50], and modern hardware acceleration [58]

  • Our work makes the following main contributions: we summarize the state of the art on DU and introduce the existing classifications of DU in the literature

Summary

INTRODUCTION

It is estimated that approximately 3.6 billion users utilized cloud storage services in 2018 [53], with Dropbox alone claiming 500 million users in 2016 [20]. Shen et al. [50] report a different observation: update requests with large update sizes are quite common in existing distributed storage systems, especially for online applications [16], [32], [57]. If a failure occurs during an update, an efficient update scheme with a rollback-based strategy [59] or a similar strategy should be considered. To meet these requirements of data update (DU) in erasure coding, a variety of update schemes based on erasure coding have been proposed in the literature to optimize DU, including computation schedule optimization [27], [41], traffic overhead reduction [39], [59], IO overhead reduction [50], and modern hardware acceleration [58].
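To make the cost of a data update concrete, the following minimal sketch contrasts a full re-encode with a delta-based parity update. It assumes the simplest possible erasure code, a single XOR parity block over equal-sized data blocks; this is an illustrative simplification, not a scheme taken from the surveyed papers. With general linear codes such as Reed-Solomon, the delta would additionally be multiplied by a coding coefficient in a Galois field. The function names `xor_bytes`, `encode_parity`, and `delta_update` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode_parity(data_blocks: list[bytes]) -> bytes:
    """Full re-encode: read every data block and XOR them together."""
    return reduce(xor_bytes, data_blocks)


def delta_update(old_parity: bytes, old_block: bytes, new_block: bytes) -> bytes:
    """Delta-based update: patch the parity using only the changed block.

    Only the old block, the new block, and the old parity are needed,
    so the untouched data blocks are never read or transferred.
    """
    delta = xor_bytes(old_block, new_block)
    return xor_bytes(old_parity, delta)


# Usage: update one block of a three-block stripe.
blocks = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = encode_parity(blocks)

new_block = b"\xaa\x55"
patched = delta_update(parity, blocks[1], new_block)
blocks[1] = new_block

# The patched parity matches a full re-encode of the updated stripe.
assert patched == encode_parity(blocks)
```

Even in this toy setting, the update touches the parity block as well as the data block, which is why computation, network traffic, and IO are the natural resources an update scheme can try to save.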

RELATED WORK
Classification Method Data Transmission Approaches
COMPUTATION OPTIMIZATION
Findings
VIII. CONCLUSIONS
