Abstract

Latent sector errors in disk drives affect only a few data sectors. They occur silently and are detected only when the affected area is accessed again. If a latent error is detected while the storage system is operating under reduced redundancy, i.e., during a RAID rebuild, data loss may occur. Features such as scrubbing and intra-disk data redundancy have been proposed to detect and/or recover from latent errors and thus avoid data loss. While such features enhance data availability in the storage system, their execution may degrade performance. In this paper, we evaluate how effectively scrubbing and intra-disk data redundancy improve data availability under the overall goal of keeping user performance within predefined bounds. We show that, when treated as low-priority background activities and scheduled efficiently during idle times, these features remain transparent to the storage system user in terms of performance while still improving data reliability. Detailed trace-driven simulations show that the mean time to data loss (MTTDL) improves by up to 5 orders of magnitude when these features are implemented independently. When both scrubbing and intra-disk parity updates are scheduled concurrently during idle times in disk drives, MTTDL improves by as much as 8 orders of magnitude.
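
The idea of running scrubbing as low-priority work during idle times can be illustrated with a minimal sketch. It is not the scheduling policy evaluated in the paper; the function name, the fixed idle-wait threshold, and the fixed per-request scrub time are all hypothetical simplifications chosen for illustration. The sketch assumes the length of each idle interval is known in advance (as in a trace replay), so that scrub work never overlaps a user request.

```python
def scrub_requests_served(idle_intervals_ms, idle_wait_ms, scrub_request_ms):
    """Count scrub requests that fit into idle intervals (hypothetical model).

    idle_intervals_ms: lengths of the idle periods observed in a trace
    idle_wait_ms:      assumed time the disk must stay idle before scrubbing starts
    scrub_request_ms:  assumed fixed service time of one scrub request
    """
    served = 0
    for idle in idle_intervals_ms:
        # Time remaining in this interval after the idle-wait has elapsed.
        usable = idle - idle_wait_ms
        if usable > 0:
            # Only whole scrub requests count; a partial one would delay user I/O.
            served += int(usable // scrub_request_ms)
    return served


# Example with made-up interval lengths (not taken from any real trace).
print(scrub_requests_served([5, 120, 40, 300], idle_wait_ms=20, scrub_request_ms=25))
```

Under these assumptions, short idle intervals contribute nothing, which is why the choice of idle-wait threshold trades scrubbing progress against the risk of delaying user requests.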
