Abstract
The need for huge storage archives rises with the ever-growing creation of data. With today’s big data and data analytics applications, some of these archives become active, in the sense that all stored data can be accessed at any time. Running and evolving these archives is a constant tradeoff between performance, capacity, and price. We present the LoneStar RAID, a disk-based storage architecture that focuses on high reliability, low energy consumption, and cheap reads. It is designed for MAID systems with up to hundreds of disk drives per server and is optimized for “write once, read sometimes” workloads. We use dedicated data and parity disks, and export the data disks as individually accessible buckets. By intertwining disk groups into a two-dimensional RAID and improving single-disk reliability with intradisk redundancy, the system achieves an elastic fault tolerance that can recover at least all 3-disk failures. Furthermore, we integrate a cache to offload parity updates and a journal to track the RAID’s state. The LoneStar RAID scheme provides a mean time to data loss (MTTDL) that competes with today’s erasure codes, and it is optimized to require only a minimal set of running disk drives.
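To make the idea of a two-dimensional RAID with dedicated parity disks more concrete, the following is a minimal sketch of generic row/column XOR parity over a grid of data disks. It is not the LoneStar layout itself: the grid shape, the function names (`xor_blocks`, `build_parity`, `recover`), and the assumption that all parity disks stay healthy are illustrative choices made here, and the sketch models neither intradisk redundancy, the cache, nor the journal.

```python
from functools import reduce
from typing import List, Optional, Tuple

Block = Optional[bytes]  # None marks the block of a failed data disk


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_parity(grid: List[List[bytes]]) -> Tuple[List[bytes], List[bytes]]:
    """Compute one parity block per row and one per column of the data grid."""
    row_parity = [xor_blocks(list(row)) for row in grid]
    col_parity = [xor_blocks(list(col)) for col in zip(*grid)]
    return row_parity, col_parity


def recover(grid: List[List[Block]],
            row_parity: List[bytes],
            col_parity: List[bytes]) -> List[List[Block]]:
    """Rebuild failed data blocks by repeatedly repairing any row or column
    that has exactly one missing block (parity disks are assumed healthy)."""
    grid = [list(row) for row in grid]  # work on a copy
    progress = True
    while progress:
        progress = False
        # Repair rows with a single missing block.
        for r, row in enumerate(grid):
            missing = [c for c, block in enumerate(row) if block is None]
            if len(missing) == 1:
                survivors = [block for block in row if block is not None]
                grid[r][missing[0]] = xor_blocks(survivors + [row_parity[r]])
                progress = True
        # Repair columns with a single missing block.
        for c in range(len(grid[0])):
            column = [grid[r][c] for r in range(len(grid))]
            missing = [r for r, block in enumerate(column) if block is None]
            if len(missing) == 1:
                survivors = [block for block in column if block is not None]
                grid[missing[0]][c] = xor_blocks(survivors + [col_parity[c]])
                progress = True
    return grid


# Hypothetical 3x3 array of data disks, each holding one 4-byte block.
data = [[bytes([10 * r + c] * 4) for c in range(3)] for r in range(3)]
row_parity, col_parity = build_parity(data)

# Fail three data disks, two of them in the same row.
damaged = [list(row) for row in data]
damaged[0][0] = damaged[0][1] = damaged[1][1] = None

restored = recover(damaged, row_parity, col_parity)
```

In this example, row 0 alone cannot be repaired because it has lost two blocks, but row 1 can. Once row 1 is rebuilt, the column parities repair the remaining two blocks, which is the benefit of intertwining the two dimensions. Reading a healthy data disk never touches any other disk, which matches the abstract's goal of cheap reads with few spinning drives.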