Abstract

Failure prediction in a population of hard drives remains challenging because of extreme class imbalance. Many existing failure prediction methods rely on problematic resampling approaches. Moreover, existing deep learning methods in this domain typically employ the sigmoid cross-entropy loss function, which does not by itself address the class imbalance problem. To address these challenges, this study proposes long short-term memory (LSTM) networks that leverage the sequential time-series Self-Monitoring, Analysis and Reporting Technology (SMART) sensor data provided by the Backblaze data center. These models employ a new, modified focal loss function combined with weighted cross-entropy loss to counter the negative influence of class imbalance during training. The LSTM models are compared with two traditional machine learning algorithms: random forest and logistic regression. To keep the SMART attributes homogeneous, the study focuses on daily snapshot data from a single hard drive model, the Seagate ST4000DM000, from the Backblaze data center. The LSTM model with the modified focal loss function achieves the highest geometric mean score, 0.786±0.011, while maintaining a higher failure detection rate without compromising the false alarm rate. This result indicates that the proposed focal loss function improves LSTM performance relative to the other approaches under class imbalance.
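The abstract does not give the form of the modified focal loss, so the sketch below is only an illustration under stated assumptions: it uses the standard class-weighted focal loss for a single binary prediction, together with the common definition of the geometric mean score as the square root of the failure detection rate times one minus the false alarm rate. The function names and the default values `alpha=0.75` and `gamma=2.0` are hypothetical and are not taken from the study.

```python
import math


def weighted_focal_loss(p, y, alpha=0.75, gamma=2.0, eps=1e-12):
    """Standard class-weighted focal loss for one binary prediction.

    p: predicted probability that the drive fails (class 1).
    y: true label, 1 for failed and 0 for healthy.
    alpha, gamma: illustrative values (assumptions, not from the study).
    """
    p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class weight
    # (1 - p_t) ** gamma down-weights examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def geometric_mean_score(y_true, y_pred):
    """G-mean as sqrt(FDR * (1 - FAR)), a common definition (assumed here)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, q in pairs if t == 1 and q == 1)
    fn = sum(1 for t, q in pairs if t == 1 and q == 0)
    tn = sum(1 for t, q in pairs if t == 0 and q == 0)
    fp = sum(1 for t, q in pairs if t == 0 and q == 1)
    fdr = tp / (tp + fn) if (tp + fn) else 0.0  # failure detection rate
    far = fp / (fp + tn) if (fp + tn) else 0.0  # false alarm rate
    return math.sqrt(fdr * (1.0 - far))


if __name__ == "__main__":
    # A confidently detected failure contributes far less loss than a missed one.
    print(weighted_focal_loss(0.9, 1), weighted_focal_loss(0.1, 1))
    print(geometric_mean_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

With `gamma=0` the focal term vanishes and the expression reduces to a weighted cross-entropy, which is one way to see why the two losses are natural to combine under class imbalance.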

