Optimal recovery of single disk failure in RDP code storage systems

Abstract

Modern storage systems use thousands of inexpensive disks to meet the storage requirements of applications. To enhance data availability, some form of redundancy is used. For example, conventional RAID-5 systems provide data availability for a single disk failure only, while more recent coding techniques such as row-diagonal parity (RDP) can provide data availability with up to two disk failures. To reduce the probability of data unavailability, disk recovery (or rebuild) is carried out whenever a single disk fails. We show that the conventional recovery scheme of RDP code for a single disk failure is inefficient and suboptimal. In this paper, we propose an optimal and efficient disk recovery scheme, Row-Diagonal Optimal Recovery (RDOR), for single disk failure of RDP code that has the following properties: (1) it is read optimal in the sense that it issues the smallest number of disk reads to recover the failed disk; (2) it has the load balancing property that all surviving disks are subjected to the same amount of additional workload in rebuilding the failed disk. We carefully explore the design state space and theoretically show the optimality of RDOR. We carry out a performance evaluation to quantify the merits of RDOR on some widely used disks.
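To make the "read optimal" claim concrete, the following Python sketch counts the blocks read when rebuilding one failed data disk in a single RDP stripe. It is not the RDOR algorithm from the paper: it assumes the standard RDP layout (prime p, p - 1 rows, disks 0..p-2 for data, disk p-1 for row parity, disk p for diagonal parity) and simply brute-forces, for each lost block, whether to rebuild it from its row parity set or its diagonal parity set. The names `recovery_reads` and `min_reads` are hypothetical.

```python
from itertools import product


def recovery_reads(p, failed, use_diag):
    """Return the set of (row, disk) blocks read to rebuild data disk `failed`.

    Assumed layout (standard RDP, stated here as an assumption): p is prime,
    a stripe has p - 1 rows, disks 0..p-2 hold data, disk p-1 holds row
    parity, disk p holds diagonal parity. Block (r, c) with c <= p-1 lies on
    diagonal (r + c) % p; diagonal p-1 has no stored parity.
    use_diag[r] says whether lost block (r, failed) is rebuilt from its
    diagonal parity set instead of its row parity set.
    """
    reads = set()
    for r in range(p - 1):
        d = (r + failed) % p
        if use_diag[r] and d != p - 1:
            # Diagonal set: one block per surviving disk 0..p-1, skipping
            # the imaginary row p-1, plus the diagonal parity block itself.
            for c in range(p):
                if c == failed:
                    continue
                row = (d - c) % p
                if row != p - 1:
                    reads.add((row, c))
            reads.add((d, p))
        else:
            # Row set: every surviving block of row r on disks 0..p-1.
            # Blocks on the parity-less diagonal always take this branch.
            for c in range(p):
                if c != failed:
                    reads.add((r, c))
    return reads


def min_reads(p, failed):
    """Brute-force the smallest read count over all row/diagonal choices."""
    return min(
        len(recovery_reads(p, failed, choice))
        for choice in product([False, True], repeat=p - 1)
    )


p = 5
conventional = len(recovery_reads(p, 0, [False] * (p - 1)))
best = min_reads(p, 0)
print(conventional, best)
```

For p = 5 and failed disk 0 this reports 16 reads for row-parity-only recovery and 12 for the best mixed choice, a saving of one quarter. The sketch only counts reads within one stripe; it does not attempt the load balancing property described in the abstract, and the exhaustive search is practical only for small p.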

Similar Papers
  • Research Article
  • Cite Count Icon 60
  • 10.1145/1811099.1811054
Optimal recovery of single disk failure in RDP code storage systems
  • Jun 12, 2010
  • ACM SIGMETRICS Performance Evaluation Review
  • Liping Xiang + 3 more

  • Research Article
  • Cite Count Icon 64
  • 10.1145/2027066.2027071
A Hybrid Approach to Failed Disk Recovery Using RAID-6 Codes
  • Oct 1, 2011
  • ACM Transactions on Storage
  • Liping Xiang + 5 more

Current parallel storage systems use thousands of inexpensive disks to meet the storage requirements of applications. Data redundancy and/or coding are used to enhance data availability; for instance, row-diagonal parity (RDP) and EVENODD codes, which are widely used in RAID-6 storage systems, provide data availability with up to two disk failures. To reduce the probability of data unavailability, disk recovery is carried out whenever a single disk fails. We find that the conventional recovery schemes of RDP and EVENODD codes for a single failed disk only use one parity disk. However, there are two parity disks in the system, and both can be used for single disk failure recovery. In this article, we propose a hybrid recovery approach that uses both parities for single disk failure recovery, and we design efficient recovery schemes for RDP code (RDOR-RDP) and EVENODD code (RDOR-EVENODD). Our recovery scheme has the following attractive properties: (1) “read optimality” in the sense that our scheme issues the smallest number of disk reads to recover a single failed disk and reduces disk reads by approximately 1/4 compared with conventional schemes; (2) “load balancing property” in that all surviving disks are subjected to the same (or almost the same) amount of additional workload in rebuilding the failed disk. We carry out a performance evaluation to quantify the merits of RDOR-RDP and RDOR-EVENODD on some widely used disks with DiskSim. The offline experimental results show that RDOR-RDP and RDOR-EVENODD outperform the conventional recovery schemes of RDP and EVENODD codes in terms of total recovery time and recovery workload on individual surviving disks. However, the improvements are less than the theoretical value (approximately 25%), as RDOR-RDP and RDOR-EVENODD change the disk access pattern from purely sequential to a more random one compared with their conventional schemes.

  • Research Article
  • Cite Count Icon 67
  • 10.1145/1629075.1629076
Higher reliability redundant disk arrays
  • Nov 1, 2009
  • ACM Transactions on Storage
  • Alexander Thomasian + 1 more

Parity is a popular form of data protection in redundant arrays of inexpensive/independent disks (RAID). RAID5 dedicates one out of N disks to parity to mask single disk failures, that is, the contents of a block on a failed disk can be reconstructed by exclusive-ORing the corresponding blocks on surviving disks. RAID5 can mask a single disk failure, and it is vulnerable to data loss if a second disk failure occurs. The RAID5 rebuild process systematically reconstructs the contents of a failed disk on a spare disk, returning the system to its original state, but the rebuild process may be unsuccessful due to unreadable sectors. This has led to two disk failure tolerant arrays (2DFTs), such as RAID6 based on Reed-Solomon (RS) codes. EVENODD, RDP (Row-Diagonal-Parity), the X-code, and RM2 (Row-Matrix) are 2DFTs with parity coding. RM2 incurs a higher level of redundancy than two disks, while the X-code is limited to a prime number of disks. RDP is optimal with respect to the number of XOR operations at the encoding, but not for short write operations. For small symbol sizes EVENODD and RDP have the same disk access pattern as RAID6, while RM2 and the X-code incur a high recovery cost with two failed disks. We describe variations to RAID5 and RAID6 organizations, including clustered RAID, different methods to update parities, rebuild processing, disk scrubbing to eliminate sector errors, and the intra-disk redundancy (IDR) method to deal with sector errors. We summarize the results of recent studies of failures in hard disk drives. We describe Markov chain reliability models to estimate RAID mean time to data loss (MTTDL) taking into account sector errors and the effect of disk scrubbing. Numerical results show that RAID5 plus IDR attains the same MTTDL level as RAID6, while incurring a lower performance penalty. We conclude with a survey of analytic and simulation studies of RAID performance and tools and benchmarks for RAID performance evaluation.
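The XOR reconstruction mentioned in this abstract can be illustrated with a minimal Python sketch. The helper name `xor_blocks` and the three tiny data blocks are made up for illustration, and the parity is kept as a separate value rather than placed in any particular on-disk layout.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three illustrative data blocks, one per data disk, plus their parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
parity = xor_blocks(data)

# Suppose the disk holding data[1] fails: XOR the surviving data blocks
# with the parity block to rebuild the lost block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same operation serves both encoding and recovery because XOR is its own inverse, which is why a single parity block can mask one failure but not two.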

  • Research Article
  • Cite Count Icon 55
  • 10.1109/tc.2007.1041
Performance of Two-Disk Failure-Tolerant Disk Arrays
  • Jun 1, 2007
  • IEEE Transactions on Computers
  • Alexander Thomasian + 2 more

RAID5 disk arrays use the rebuild process to reconstruct the contents of a failed disk on a spare disk, but this process is unsuccessful if latent sector failures are encountered or a second disk failure occurs. The high cost of data loss has led to two-disk failure-tolerant (2DFT) arrays: RAID6, EVENODD, row-diagonal parity (RDP), and RM2. RAID6 uses Reed-Solomon (RS) codes, whereas the latter three use parity encoding. This paper is concerned with the performance from the viewpoint of disk accesses, which, with an appropriate choice of symbol sizes, is the same for RAID6, EVENODD, and RDP, rather than the computational cost (number of XOR operations). We compare the performance of 2DFTs with each other, as well as RAID0 and RAID5 in normal and degraded operating modes. We derive cost functions for processing discrete disk accesses. The mean response time can be obtained analytically with Poisson arrivals and first-come, first-served (FCFS) scheduling. A simulation is used for validation, calibrating the approximate fork-join response analysis, and shortest-access-time-first (SATF) scheduling. The response time for read requests in RAID6 and RM2 is higher than RAID5 and RAID0 and increases with the fraction of write requests. When there is a single disk failure, RM2 outperforms RAID6 since it has a smaller parity group size than RAID6, but RAID6 outperforms RM2 with two disk failures because of its costlier recovery process. Disk loads in RM2 and RAID6 in degraded mode are unbalanced, and a solution based on pseudorandom permutations is proposed for this purpose.

  • Research Article
  • Cite Count Icon 65
  • 10.1109/tc.2013.8
Single Disk Failure Recovery for X-Code-Based Parallel Storage Systems
  • Apr 1, 2014
  • IEEE Transactions on Computers
  • Silei Xu + 6 more

In modern parallel storage systems (e.g., cloud storage and data centers), it is important to provide data availability guarantees against disk (or storage node) failures via redundancy coding schemes. One coding scheme is X-code, which is double-fault tolerant while achieving the optimal update complexity. When a disk/node fails, recovery must be carried out to reduce the possibility of data unavailability. We propose an X-code-based optimal recovery scheme called minimum-disk-read-recovery (MDRR), which minimizes the number of disk reads for single-disk failure recovery. We make several contributions. First, we show that MDRR provides optimal single-disk failure recovery and reduces about 25 percent of disk reads compared to the conventional recovery approach. Second, we prove that any optimal recovery scheme for X-code cannot balance disk reads among different disks within a single stripe in general cases. Third, we propose an efficient logical encoding scheme that issues balanced disk read in a group of stripes for any recovery algorithm (including the MDRR scheme). Finally, we implement our proposed recovery schemes and conduct extensive testbed experiments in a networked storage system prototype. Experiments indicate that MDRR reduces around 20 percent of recovery time of the conventional approach, showing that our theoretical findings are applicable in practice.

  • Research Article
  • Cite Count Icon 14
  • 10.1109/tpds.2015.2442979
Reconsidering Single Disk Failure Recovery for Erasure Coded Storage Systems: Optimizing Load Balancing in Stack-Level
  • May 1, 2016
  • IEEE Transactions on Parallel and Distributed Systems
  • Yingxun Fu + 3 more

The rapid growth of data scale encourages the wide deployment of data disks with large storage capacity. However, deploying a large number of data disks in turn increases the probability of data loss or damage, because of the various kinds of disk failures that can occur. To ensure the integrity of the hosted data, modern storage systems usually adopt erasure codes, which can recover the lost data by pre-storing a small amount of redundant information. As the most common case among all the recovery mechanisms, single disk failure recovery has received intensive attention over the past few years. However, most existing works still take stripe-level recovery as their only consideration, and a considerable performance improvement on single failed disk reconstruction at the stack level (i.e., a group of rotated stripes) is missed. To seize this potential improvement, in this paper we systematically study the problem of single failure recovery at the stack level. We first propose two recovery mechanisms based on a greedy algorithm to seek a near-optimal solution (BP-Scheme and STP-Scheme) for any erasure array code at the stack level, and further design a rotated recovery algorithm (RR-Algorithm) to reduce the size of required memory. Through a rigorous statistical analysis and intensive evaluation on a real system, the results show that BP-Scheme gains 3.4 to 38.9 percent (the average is 21.2 percent) higher recovery speed than Khan's Scheme and 3.4 to 34.8 percent (the average is 19.1 percent) higher recovery speed than Luo's U-Scheme, while STP-Scheme achieves 3.4 to 46.9 percent (the average is 25.15 percent) and 3.4 to 41.1 percent (the average is 22.3 percent) higher recovery speed than Khan's Scheme and Luo's U-Scheme, respectively.

  • Book Chapter
  • Cite Count Icon 1
  • 10.5772/8870
Design of Simple and High Speed Scheme to Protect Mass Storages
  • Apr 1, 2010
  • Ming-Haw Jing + 4 more

The computer industry has entered a stage of unprecedented improvement in CPU performance. However, the speed of file system management of huge information is commonly considered as the main factor that affects the computer performance; for example, the I/O bandwidth is limited by magnetic disks. The capacity and cost of magnetic disks per megabyte have been continually improved, but the rotation speed and seek time are improved very slowly. Recently, many computers have become I/O bound in the applications of video, audio, commercial database, etc. If such an I/O crisis can be resolved, the computer system performance will be improved. In 1988, Patterson et al. proposed the redundant array of independent disks (RAID) system which allows the data to be separated into several disks (Patterson et al., 1988). We can access the data in parallel so that the throughput of I/O systems will be improved. On the other hand, more disks in a RAID system have a higher risk of losing data because of high component failure rates. As a result, safety and reliability have become the major issues in the RAID system. When designing a highly available and reliable RAID system, the method of bitwise parity checking is mostly used to correct errors and to enhance reliability of the RAID system. However, the parity checking method is limited so that only a single disk failure can be tolerated. In 1995, Blaum et al. proposed a method called even-odd code, which tolerates up to two disk failures in the RAID system (Blaum et al., 1995). Even-odd code is the first known scheme for tolerating single or double disk failures, providing an optimal solution with regard to both storage and performance. However, the major problem concerning the even-odd code is a variety of modes of operations when solving erasures or up to 2 disk failures. In practice, it is not easy to integrate into VLSI.
On the other hand, a small write problem is difficult to be solved with the even-odd code (Liao & Jing, 2002). In 1997, Plank proposed a tutorial by using the Reed-Solomon (RS) code to provide error correction in the RAID system (Plank, 1997). In 2000, Jing et al. also proposed a simple algorithm, called RS-RAID system, to combine the RS codes with the RAID system (Jing et al. 2000). In this chapter, we aim to improve RS codes codec to design a fast error and erasure correction for RS-RAID system, and to solve the small write problem in RS codes. In a RS decoder, there are various algorithms to solve the error locator polynomial, which affect the 6

  • Conference Article
  • Cite Count Icon 19
  • 10.1109/srds.2014.29
A Stack-Based Single Disk Failure Recovery Scheme for Erasure Coded Storage Systems
  • Oct 1, 2014
  • Yingxun Fu + 2 more

The rapid growth of data scale encourages the wide deployment of data disks with large storage capacity. However, deploying a large number of data disks in turn increases the probability of data loss or damage, because of the various kinds of disk failures that can occur. To ensure the integrity of the hosted data, modern storage systems usually adopt erasure codes, which can recover the lost data by pre-storing a small amount of redundant information. As the most common case among all the recovery mechanisms, single disk failure recovery has received intensive attention over the past few years. However, most existing works in this literature still take stripe-level recovery as their only consideration, and a considerable performance improvement on single failed disk reconstruction at the stack level (i.e., a group of rotated stripes) is missed. To seize this potential improvement, in this paper we systematically study the problem of single failure recovery at the stack level. We first propose a recovery mechanism based on a greedy algorithm to seek a near-optimal solution (BP-Scheme) for any erasure array code at the stack level, and further design a rotated recovery algorithm (RR-Algorithm) to reduce the size of required memory. Through a rigorous statistical analysis and intensive evaluation on a real system, the results show that BP-Scheme gains at most 38.9% higher recovery speed than Khan's Scheme, and achieves up to 34.8% higher recovery speed than Luo's U-Scheme.

  • Conference Article
  • Cite Count Icon 1
  • 10.1109/iceiec.2018.8473543
L-Code: An Efficient Coding Scheme for Recovering Single Disk Failure
  • Jun 1, 2018
  • Wenbo Liu + 2 more

In modern storage systems, the reliability of data is very important. Researchers have put forward many methods to deal with disk failures, and one important approach is erasure coding. Among the disk failure cases that erasure-coded systems face, the most common is single disk failure, which has been receiving more attention in recent years. In this paper, we present an efficient erasure code scheme named L-code. Through a different placement and calculation of redundant elements, this scheme can improve the performance of single disk failure reconstruction. To demonstrate this, we ran experiments, and the results show that L-code gains up to 34.4% higher recovery performance than optimized H-code and 42.7% higher than optimized EVENODD. In encoding complexity, our scheme also gains up to 1.9% over optimized H-code and 44.2% over optimized EVENODD.

  • Conference Article
  • Cite Count Icon 1
  • 10.1109/prdc.2015.16
Combining Low IO-Operations During Data Recovery with Low Parity Overhead in Two-Failure Tolerant Archival Storage Systems
  • Nov 1, 2015
  • Thomas Schwarz + 2 more

Archival data storage systems contain data that must be preserved over long periods of time but which are often unlikely to be accessed during their lifetime. The best strategy for such systems is to keep their disks powered-off unless they have to be powered up to access their contents, to reconstruct lost data, or to perform other disk maintenance tasks. Of all such tasks, reconstructing data after a disk failure is the one that is likely to have the highest energy footprint and the most impact on the overall power consumption of the array, because it typically involves powering up all the disks belonging to the same reliability stripe as the failed disk and keeping them running for considerable time at each occurrence. We investigate two two-failure tolerant disk layouts that have lower parity overhead than the number of disks read (and hence powered-on) for recovering data on lost drives would suggest. Our first organization is a flat XOR code that organizes the data disks into a rectangle with fewer rows than columns, and adds a simple parity disk to each row and column. Recovery from a disk failure proceeds by preferring columns when reconstructing lost data, and thereby has fewer reads than the parity overhead would normally suggest. Our second layout is based on the most basic pyramid code. We can view this layout as an example RAID Level 6 variant. In this variant, a stripe has a Q-parity calculated from the data disks in the stripe, but the data disks are also organized into smaller groups where each group has a separate P-parity calculated as the exclusive-or of the data disks in the group. We compare the two layouts by measuring their robustness to data loss, their one-year survival rate, and the expected number of disks that must be involved to recover from both single and multiple disk failures.
Our results show that rectangular layouts are significantly more reliable than layouts based on the most basic Pyramid codes, but that they also require more disk accesses to recover from disk failures.

  • Book Chapter
  • Cite Count Icon 13
  • 10.1007/978-3-540-49823-0_33
Self-adaptive Disk Arrays
  • Jan 1, 2006
  • Jehan-François Pâris + 2 more

We present a disk array organization that adapts itself to successive disk failures. When all disks are operational, all data are mirrored on two disks. Whenever a disk fails, the array reorganizes itself, by selecting a disk containing redundant data and replacing these data by their exclusive or (XOR) with the other copy of the data contained on the disk that failed. This will protect the array against any single disk failure until the failed disk gets replaced and the array can revert to its original condition. Hence data will remain protected against the successive failures of up to one half of the original number of disks, provided that no critical disk failure happens while the array is reorganizing itself. As a result, our scheme achieves the same access times as a mirrored organization under normal operational conditions while having a much lower likelihood of losing data under abnormal conditions. In addition it tolerates much longer repair times than mirrored disk arrays.

  • Book Chapter
  • Cite Count Icon 8
  • 10.1007/978-981-13-0514-6_24
RAID-6 Code Variants for Recovery of a Failed Disk
  • Aug 22, 2018
  • M P Ramkumar + 3 more

With the increasing demand for capacity, speed, and reliability in large-scale storage systems, a mechanism should exist to ensure data availability. Though various erasure code implementations exist for RAID-6, maximum distance separable (MDS) codes provide a simple yet better data protection and recovery mechanism in the event of a disk failure. RAID-6 is preferred due to its fault tolerance against two simultaneous disk failures. In addition to providing fault tolerance against disk failures, it is also necessary to concentrate on recovery to avoid data unavailability. Even though RAID-6 tolerates two disk failures, data will be lost when the number of disk failures exceeds the number of parity disks. Hence, it is important to address single disk failure and recover the failed disk at the earliest. The early recovery of a failed disk (i.e., recovery time) depends on the number of overlapping blocks. The hybrid code achieves a better recovery time than the other categories by consuming 22% of reused blocks.

  • Conference Article
  • 10.1109/pacrim.1997.619956
A cost-effective solution to the single disk failure in RAID architecture
  • Aug 20, 1997
  • Sanghoon Jeon + 1 more

It is very important to recover data immediately at a single disk failure for critical applications such as multimedia storage systems, real-time systems and so on. As an efficient solution, the paper proposes a hybrid scheme to improve the performance before a failed disk is replaced with a new disk and it does not require an additional disk to recover data. The hybrid scheme is evaluated and its performance is analyzed for various request sizes using the simulation. The results show that the performance of the hybrid scheme is improved by up to 85% compared with that of RAID level 5 in the reconfigured mode.

  • Research Article
  • Cite Count Icon 5
  • 10.1016/j.ipl.2011.03.001
X-code double parity array operation with two disk failures
  • Mar 16, 2011
  • Information Processing Letters
  • Alexander Thomasian + 1 more

  • Conference Article
  • Cite Count Icon 6
  • 10.1117/12.264285
Fault-tolerant video server using combined RAID 5 and mirroring
  • Jan 24, 1997
  • Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE
  • Ernst W Biersack + 1 more

Video servers must use large disk arrays to provide the huge amount of storage capacity and bandwidth needed. As the number of disk drives increases, the probability of a video server failure increases too. We propose a redundancy scheme that uses both RAID 5 techniques and mirroring to make a video server tolerant against all single disk failures. Our approach provides a unified framework for the use of RAID 5 and mirroring to achieve fault tolerance at the lowest additional cost possible, while guaranteeing 100 percent service availability even when operating with a failed disk.
