Abstract

NAND-based solid state storage devices are almost ubiquitous in safety-critical embedded devices, and recent advances have demonstrated RAID architectures specific to solid state storage that increase data reliability, with architectural enhancements to solve the age convergence problem. However, these techniques require devices to be taken offline while components are replaced; consequently, they are of limited use in hard real-time systems. There are further real-time issues in that conventional architectures ignore other characteristics of solid state devices, such as garbage collection and metadata management. In this paper we investigate techniques that support the replacement of aged devices in the array while providing continuous system reliability. We also reduce the performance overhead of the reconstruction process using a novel data migration policy. The techniques are implemented and tested in a trace-driven simulator, and the results demonstrate that average I/O response time is improved by up to 39%, with an improvement of up to 45% in its standard deviation; overheads in terms of device replacement time are negligible, and read performance is improved by an average of 8%.

Highlights

  • Many embedded systems, including those that are safety critical, have to observe strict constraints in terms of shock resistance, energy consumption, physical size, and other factors

  • Multi-level cell (MLC) devices do not lend themselves to ECC techniques, as the size of their metadata areas is limited

  • The overall view of the Redundant Array of Independent Disks (RAID) array is described by the pseudocode in Data structure 2. It consists of the type of the array (RAID-4, RAID-5, Diff RAID, or semi hybrid), the number of devices in the array, the erasure limit threshold for those devices, a boolean variable indicating whether the device replacement process is currently active, an array of metadata structures describing the state of each device, and the stripe mapping table, which contains details of all stripes of data stored in the system
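
The array view listed in the last highlight can be sketched in Python as follows. This is only an illustration of the fields named there, not the paper's Data structure 2: all class and field names, the contents of `DeviceMetadata` and `Stripe`, and the helper `devices_at_limit` are assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ArrayType(Enum):
    RAID4 = "RAID-4"
    RAID5 = "RAID-5"
    DIFF_RAID = "Diff RAID"
    SEMI_HYBRID = "semi hybrid"


@dataclass
class DeviceMetadata:
    # Hypothetical fields: the source only says that each device has a
    # metadata structure describing its state.
    device_id: int
    erase_count: int = 0


@dataclass
class Stripe:
    # Hypothetical stripe record: which devices hold data and which holds parity.
    data_devices: List[int]
    parity_device: int


@dataclass
class RaidArray:
    array_type: ArrayType             # RAID-4, RAID-5, Diff RAID, or semi hybrid
    num_devices: int                  # number of devices in the array
    erasure_limit: int                # erasure limit threshold for the devices
    replacement_active: bool = False  # is device replacement currently running?
    devices: List[DeviceMetadata] = field(default_factory=list)
    stripe_map: Dict[int, Stripe] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Create one fresh metadata record per device if none were supplied.
        if not self.devices:
            self.devices = [DeviceMetadata(i) for i in range(self.num_devices)]

    def devices_at_limit(self) -> List[int]:
        """Return the ids of devices whose erase count has reached the threshold."""
        return [d.device_id for d in self.devices
                if d.erase_count >= self.erasure_limit]
```

For example, `RaidArray(ArrayType.RAID5, num_devices=4, erasure_limit=1000)` creates four device records with zero erase counts, and `devices_at_limit()` lists the devices that would be candidates for replacement.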

Summary

Introduction

Many embedded systems, including those that are safety critical, have to observe strict constraints in terms of shock resistance, energy consumption, physical size, and other factors. A recent RAID architecture for solid state devices relies on two mechanisms. The first is an uneven parity distribution (the ratios of parity data across the devices of the array), which ensures that erases are distributed unevenly across components. The second is a device copy/swap algorithm that moves data around and manages lifespan as components reach their endurance limits. The limitation of this architecture is that the copy/swap operation requires the array to be taken offline. So whilst this mechanism significantly enhances reliability, it restricts its usage in hard real-time systems, as the array is not able to serve requests during the component replacement period.
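
To make the idea of an uneven parity distribution concrete, the following minimal sketch assigns the parity block of each stripe to a device according to integer weights. The weights, the function names, and the simple cyclic placement rule are assumptions chosen for illustration; they are not the placement policy of any specific architecture described here. Because a parity block is rewritten whenever data in its stripe changes, a device holding a larger share of parity can be expected to accumulate erases faster.

```python
from collections import Counter
from typing import List


def parity_device_for_stripe(stripe: int, weights: List[int]) -> int:
    """Pick the device that holds parity for a stripe.

    weights[i] is the (hypothetical) number of parity blocks device i
    receives in every cycle of sum(weights) consecutive stripes.
    """
    cycle = sum(weights)
    if cycle <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    slot = stripe % cycle
    # Walk through the devices until the slot falls inside a device's share.
    for device, weight in enumerate(weights):
        if slot < weight:
            return device
        slot -= weight
    raise AssertionError("unreachable: slot is always below sum(weights)")


def parity_counts(num_stripes: int, weights: List[int]) -> Counter:
    """Count how many parity blocks each device holds over num_stripes stripes."""
    return Counter(parity_device_for_stripe(s, weights)
                   for s in range(num_stripes))
```

With weights `[4, 3, 2, 1]`, device 0 holds parity for four out of every ten stripes and device 3 for one. In this sketch, equal weights such as `[1, 1, 1, 1]` rotate parity evenly in the style of RAID-5, while `[0, 0, 0, 1]` places all parity on a single device in the style of RAID-4.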

Motivation and Related Work
Architectural Design
Coordinated Data Migration
Cost Effective Parity Redistribution
Semi-Hybrid RAID
Performance Evaluation
Findings
Conclusions and Future Work