Abstract

Many hard drive failure prediction methods rely on Self-Monitoring, Analysis and Reporting Technology (SMART) features, but IT systems with demanding performance requirements do not collect these features so that they can meet their specifications. We present a novel data-driven failure prediction method that instead uses disk-level performance metrics collected by Redundant Array of Independent Disks (RAID) controllers. The proposed method computes relational anomaly scores that exploit the logical relationships among Hard Disk Drives (HDDs) defined by the RAID configuration, and it uses error codes reported by the HDDs to filter out false positives. We evaluate the method on a real-world dataset collected for this study in a data center, covering 881 disks in RAID-6 arrays and 1660 disks in RAID-10 arrays. The results show that both the logical relationships and the error-code-based filtering consistently improve performance. In addition, seven of the nine failures are predicted at least one day before they occur, which suggests that the method leaves enough time to replace an HDD before it fails.
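The abstract does not say how the relational anomaly scores are computed, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each disk in a RAID group reports one scalar performance metric (a hypothetical latency value here), scores a disk by how far it deviates from the median of its peers in the same group, and keeps a prediction only if the disk has also reported an error code. The function names, the median-based score, and the threshold are all assumptions made for illustration.

```python
from statistics import median


def relational_scores(group_metrics):
    """Score each disk by its deviation from the peers in its RAID group.

    Assumption: a robust z-score (median and median absolute deviation of
    the peers) stands in for the relational anomaly score.
    """
    scores = {}
    for disk, value in group_metrics.items():
        peers = [v for d, v in group_metrics.items() if d != disk]
        center = median(peers)
        # Spread of the peers; the floor avoids division by zero.
        spread = max(median([abs(v - center) for v in peers]), 1e-9)
        scores[disk] = abs(value - center) / spread
    return scores


def predict_failures(group_metrics, disks_with_errors, threshold=5.0):
    """Flag disks whose score exceeds the threshold AND that reported an error.

    Assumption: error-code filtering is modeled as a simple set membership
    test that removes high-scoring disks with no reported error code.
    """
    scores = relational_scores(group_metrics)
    return sorted(
        disk
        for disk, score in scores.items()
        if score > threshold and disk in disks_with_errors
    )


# Hypothetical per-disk latency values (ms) for one RAID group.
group = {"d0": 5.0, "d1": 5.2, "d2": 4.9, "d3": 12.0}
flagged = predict_failures(group, disks_with_errors={"d3"})
```

Here `d3` stands out from its peers and has an error code, so it is the only disk flagged. The same disk would be dropped as a false positive if it had reported no error code.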
