Reliability and performance enhancements for SSD RAID

Alistair A Mcewan, Muhammed Ziya Komsul

doi:10.1016/j.micpro.2016.11.012

Open Access

Journal: Microprocessors and Microsystems	Publication Date: Nov 18, 2016
Citations: 4	License type: cc-by

Affiliation: University of Leicester

Abstract

NAND-based solid state storage devices are almost ubiquitous in safety-critical embedded devices, and recent advances have demonstrated RAID architectures specific to solid state storage that increase data reliability, with architectural enhancements to solve the age convergence problem. However, these techniques require devices to be taken off-line while components are replaced; consequently, they are of limited use in hard real-time systems. There are further real-time issues in that conventional architectures ignore other characteristics of solid state devices, such as garbage collection and metadata management. In this paper we investigate techniques that support the replacement of aged devices in the array while providing continuous system reliability. We also reduce the performance overhead of the reconstruction process using a novel data migration policy. The techniques are implemented and tested in a trace-driven simulator. Results demonstrate that average I/O response time is improved by up to 39%, with an improvement of up to 45% in its standard deviation; overheads in terms of device replacement time are negligible; and read performance is improved by an average of 8%.

Highlights

Many embedded systems, including those that are safety critical, have to observe strict constraints in terms of shock resistance, energy consumption, physical size, and other factors
Multiple level cell (MLC) devices do not lend themselves to ECC techniques, as the size of metadata areas is limited
The view of the overall Redundant Array of Independent Disks (RAID) array is described by the pseudocode data structure in Data structure 2, and consists of the type of the array (RAID-4, RAID-5, Diff RAID, or semi hybrid), the number of devices in the array, the erasure limit threshold for the devices in the array, a boolean variable indicating whether the device replacement process is currently active, an array of metadata structures describing the state of each device in the array, and the stripe mapping table, which contains details of all stripes of data stored in the system
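
The array view listed in the last highlight can be pictured with a short Python sketch. This is not the paper's Data structure 2, which is not reproduced on this page; all class and field names below (RaidArray, DeviceMetadata, StripeEntry, erase_count, parity_device) are hypothetical, and the per-device and per-stripe fields are assumptions chosen only to make the example self-contained.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class RaidType(Enum):
    """Array types named in the highlight."""
    RAID4 = "RAID-4"
    RAID5 = "RAID-5"
    DIFF_RAID = "Diff RAID"
    SEMI_HYBRID = "semi hybrid"


@dataclass
class DeviceMetadata:
    # Assumed per-device state; the source does not list these fields.
    device_id: int
    erase_count: int = 0


@dataclass
class StripeEntry:
    # Assumed stripe record: which device holds the parity of this stripe.
    stripe_id: int
    parity_device: int


@dataclass
class RaidArray:
    raid_type: RaidType
    num_devices: int
    erasure_limit: int                 # erasure limit threshold for devices
    replacement_active: bool = False   # is device replacement in progress?
    devices: List[DeviceMetadata] = field(default_factory=list)
    stripe_map: Dict[int, StripeEntry] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Create one metadata record per device if none were supplied.
        if not self.devices:
            self.devices = [DeviceMetadata(i) for i in range(self.num_devices)]

    def devices_over_limit(self) -> List[int]:
        """Return ids of devices whose erase count reached the threshold."""
        return [d.device_id for d in self.devices
                if d.erase_count >= self.erasure_limit]


array = RaidArray(RaidType.RAID5, num_devices=4, erasure_limit=10_000)
array.devices[2].erase_count = 10_000
print(array.devices_over_limit())  # [2]
```

Keeping the replacement flag on the array object rather than on each device mirrors the description above, where a single boolean records whether the replacement process is active.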

Summary

Introduction

Many embedded systems, including those that are safety critical, have to observe strict constraints in terms of shock resistance, energy consumption, physical size, and other factors. An existing RAID architecture for solid state devices combines two mechanisms. The first is an uneven parity distribution, which refers to the ratios of parity data across the devices of the array and ensures that erases are distributed unevenly across components. The second is a device copy/swap algorithm that moves data around and manages lifespan as components reach their endurance limits. The limitation of this architecture is that the copy/swap operation requires the array to be taken off-line. So whilst this mechanism significantly enhances reliability, it restricts its usage in hard real-time systems, as the array is not able to serve requests during the component replacement period.
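
To make the idea of an uneven parity distribution concrete, the following Python sketch assigns the parity block of each stripe to a device so that the share of parity per device follows a given set of integer weights. It is an illustration only, not the placement algorithm of the architecture described above: the function name assign_parity, the largest-deficit selection rule, and the 70/10/10/10 ratio are all assumptions made for the example.

```python
from typing import List, Sequence


def assign_parity(weights: Sequence[int], num_stripes: int) -> List[int]:
    """Pick a parity device for each stripe in proportion to `weights`.

    weights[d] is the relative share of parity that device d should hold.
    At each stripe, the device furthest below its target share is chosen
    (ties go to the lowest device index). Integer arithmetic avoids
    floating-point ties.
    """
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")

    counts = [0] * len(weights)   # parity stripes assigned so far
    placement: List[int] = []
    for stripe in range(1, num_stripes + 1):
        # Deficit scaled by `total`: target share minus assigned share.
        deficits = [w * stripe - c * total for w, c in zip(weights, counts)]
        device = max(range(len(weights)), key=lambda d: deficits[d])
        counts[device] += 1
        placement.append(device)
    return placement


# Equal weights give a rotating, RAID-5-like placement.
print(assign_parity([1, 1, 1, 1], 4))        # [0, 1, 2, 3]

# Uneven weights concentrate parity, and hence parity updates, on device 0.
uneven = assign_parity([70, 10, 10, 10], 10)
print([uneven.count(d) for d in range(4)])   # [7, 1, 1, 1]
```

Because parity blocks are rewritten on every update to their stripe, the device holding most of the parity in this sketch would accumulate erases faster than the others, which is the uneven ageing the text refers to.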

Motivation and Related Work

Architectural Design

Coordinated Data Migration

Cost Effective Parity Redistribution

Semi-Hybrid RAID

16: We are not using a semi hybrid configuration

Performance Evaluation

Findings

Conclusions and Future Work
