On the Distribution Modeling of Heavy-Tailed Disk Failure Lifetime in Big Data Centers

Suayb S Arslan, Engin Zeydan

doi:10.1109/tr.2020.3007127

Open Access

Abstract

Frequent multiple disk failures have become commonplace in big data centers, where thousands of drives operate simultaneously. Disks are typically protected by replication or erasure coding to guarantee a predetermined reliability. To optimize data protection, however, real-life disk failure trends need to be modeled appropriately. The classical approach is to estimate the probability density function of failures using nonparametric techniques such as kernel density estimation (KDE). These techniques are suboptimal when the true underlying density function is unknown, and insufficient data may lead to overfitting. In this article, we propose applying a set of transformations to the collected failure data to obtain an almost perfect regression in the transform domain. Then, by inverse transformation, we analytically estimate the failure density through efficient computation of the moment generating functions and, hence, the density functions. We also develop a visualization platform to extract useful statistical information such as the model-based mean time to failure. Our results indicate that, for other heavy-tailed data, the complex Gaussian hypergeometric distribution and the classical KDE approach can perform best if the overfitting problem can be avoided and the complexity burden is overcome. On the other hand, we show that the failure distribution exhibits a less complex Argus-like distribution after performing the Box-Cox transformation, up to appropriate scaling and shifting operations.

Highlights

Hard drives and more recent Solid State Drives (SSDs) have become the most common data storage units of today's data centers.
Since the reliability function R(t) is closely related to the cumulative distribution function F(t) through the relationship R(t) = 1 − F(t), it is of interest to estimate the probability density function (PDF) of failures to be able to quantify the reliability of storage devices.
We have studied the probabilistic modeling of real-life disk failure lifetime as well as analyzed the storage statistics.

Summary

Introduction

Hard drives and more recent Solid State Drives (SSDs) have become the most common data storage units of today's data centers. These devices operate in close proximity and share the same geographical area, so they are affected by similar environmental factors and by the same hardware and network infrastructure. This increases the likelihood that they experience similar problems or undergo closely related fault scenarios [1]. A hardware or network problem can cause multiple storage devices to fail or become unavailable simultaneously in the network. Since the reliability function R(t) is closely related to the cumulative distribution function F(t) through the relationship R(t) = 1 − F(t), it is of interest to estimate the probability density function (PDF) of failures to be able to quantify the reliability of storage devices.
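The relationship R(t) = 1 − F(t) can be illustrated with a minimal Python sketch. It is not the article's method: it simply uses the empirical CDF of a handful of made-up lifetimes and assumes every drive was observed until failure (no censoring). The function names and the sample data are hypothetical.

```python
from bisect import bisect_right


def empirical_reliability(lifetimes):
    """Return a function R(t) = 1 - F(t), where F is the empirical CDF.

    Assumption: every lifetime is a fully observed failure time (no censoring).
    """
    ordered = sorted(lifetimes)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one lifetime")

    def reliability(t):
        # F(t) is the fraction of drives that failed at or before time t.
        failed = bisect_right(ordered, t)
        return 1.0 - failed / n

    return reliability


def mean_time_to_failure(lifetimes):
    """MTTF is the integral of R(t) over t >= 0.

    For the empirical R(t) of nonnegative lifetimes this integral
    reduces to the sample mean.
    """
    if not lifetimes:
        raise ValueError("need at least one lifetime")
    return sum(lifetimes) / len(lifetimes)


# Hypothetical lifetimes in days, invented purely for illustration.
lifetimes_days = [120, 340, 340, 800, 1500]
R = empirical_reliability(lifetimes_days)
mttf = mean_time_to_failure(lifetimes_days)
print(R(0), R(340), R(2000), mttf)
```

With real field data many drives are still running when the data is collected, so a censoring-aware estimator would be needed in place of the plain empirical CDF used here.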

Journal: IEEE Transactions on Reliability
Publication Date: Jul 27, 2020
Citations: 4
License type: publisher-specific-oa

Similar Papers

A visualization platfom for disk failure analysis
Ibrahim Onuralp Yigit ... Şuayb Ş Arslan
01 May 2018

Development of a kernel density estimation with hybrid estimated bounded data
Young-Jin Kang ... O-Kaung Lim
Journal of Mechanical Science and Technology | VOL. 32
01 Dec 2018

GPU Acceleration of Mean Free Path Based Kernel Density Estimators for Monte Carlo Neutronics Simulations
Timothy Burke ... Forrest Brown
19 Nov 2015

Nonparametric Analysis in Accounting Research
Frank Murphy ... Stephanie Miller
SSRN Electronic Journal
19 Jun 2019
