Abstract

Due to the continuous upscaling of storage capacity and downscaling of the physical size of storage systems, data reliability has become a prominent research problem. While various redundancy schemes have been proposed to include extra redundant data so that the correct data can be recovered even in the presence of bit errors or bad blocks, they may introduce unexpected performance overheads on data accesses. In particular, on unreliable block devices, a number of data blocks are grouped together for redundant data computation, so even reading a single bad block may require reading out the whole group, including the data blocks and their redundant blocks, to recover the correct data, thereby considerably amplifying the read traffic. On the other hand, the replication scheme avoids such serious read amplification, but incurs a serious space overhead to keep multiple copies of the same data. In this work, we propose to integrate the redundancy and replication schemes according to the access patterns of different data, which strikes a proper balance between the space and performance overheads. The proposed scheme, "redundant or replicated data (R2D)," is then verified through experimental studies, where the results are quite encouraging.
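The read-amplification effect described above can be illustrated with a minimal sketch of single-parity (XOR) recovery, the simplest redundancy scheme of the kind the abstract discusses. The group size, block contents, and helper names below are illustrative assumptions, not part of the R2D design:

```python
from functools import reduce

def parity_block(blocks: list[bytes]) -> bytes:
    """Compute a single XOR parity block over equal-sized data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def recover_block(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild one lost block: XOR the parity with all surviving blocks.
    Every surviving block in the group must be read -- this is the
    read amplification the abstract describes."""
    return parity_block(surviving + [parity])

# Hypothetical group of 4 data blocks protected by 1 parity block
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
p = parity_block(data)

# Block 2 goes bad; recovering it requires reading 3 good blocks + parity,
# i.e. 4 block reads to serve what was logically a single-block read.
rebuilt = recover_block([data[0], data[1], data[3]], p)
assert rebuilt == data[2]
```

With full replication, by contrast, the same failed read would be served by one read of a mirror copy, at the cost of storing every block twice; trading between these two regimes per data item is the balance R2D targets.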
