As the most widely used storage device today, hard disks are efficient and convenient, but a failure can cause significant data loss. Early warning of an impending failure allows the stored content to be backed up and transferred in advance, which reduces that loss. In recent years, a large body of research on hard disk failure prediction has emerged. Detection accuracy has steadily improved across methods ranging from basic machine learning models, such as decision trees and random forests, to deep learning methods, such as BP neural networks and recurrent neural networks. In this paper, based on the idea of blending ensemble learning, a novel failure prediction method combining machine learning algorithms and neural networks is proposed and evaluated on the publicly available BackBlaze hard disk datasets. The failure prediction experiments use only S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) attributes, that is, the statistics that the hard disk records internally during operation. The experimental results show that this ensemble learning model outperforms the individual models under an evaluation criterion based on the Matthews correlation coefficient. Additionally, experiments on multiple hard disk models identify an ensemble learning model that performs well on most of them, which addresses the low robustness and poor generalization of traditional machine learning methods and demonstrates the effectiveness and broad applicability of the proposed method.
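To illustrate why an evaluation criterion based on the Matthews correlation coefficient (MCC) suits failure prediction, where failed disks are rare, the following minimal sketch computes MCC and plain accuracy for two hypothetical predictors. The labels (1 for failed, 0 for healthy), the toy data, and the function names are assumptions made for illustration; they are not taken from the paper or from the BackBlaze datasets.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = failed, 0 = healthy)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical toy data: 98 healthy disks followed by 2 failed disks.
y_true = [0] * 98 + [1] * 2

# Predictor A never raises an alarm.
always_healthy = [0] * 100

# Predictor B raises one false alarm and catches one of the two failures.
catches_one = [0] * 97 + [1] + [1, 0]

for name, y_pred in [("always healthy", always_healthy),
                     ("catches one", catches_one)]:
    print(name, accuracy(y_true, y_pred), matthews_corrcoef(y_true, y_pred))
```

Both predictors reach the same accuracy on this toy data, yet only the second has a positive MCC, because MCC uses all four cells of the confusion matrix and is not inflated by the dominant healthy class.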