Abstract
Identifying and anticipating potential failures in the cloud is an effective way to increase cloud reliability and enable proactive failure management. Many studies have sought to predict potential failures, but none have combined SMART (self-monitoring, analysis, and reporting technology) hard drive metrics with other system metrics, such as central processing unit (CPU) utilisation. Therefore, we propose a combined system metrics approach for failure prediction based on artificial intelligence to improve reliability. We evaluated four artificial intelligence algorithms (random forest, gradient boosting, long short-term memory, and gated recurrent unit) on data from over 100 cloud servers, and also performed correlation analysis. Our correlation analysis sheds light on the relationships between system metrics and failure, and the experimental results demonstrate that combining system metrics outperforms the state of the art.
Highlights
In this study, taking advantage of advances in artificial intelligence (AI), we focus on failure prediction based on four AI techniques: random forest (RF), gradient boosting (GB), long short-term memory (LSTM), and gated recurrent unit (GRU).
The aim of this study is to improve the reliability of cloud services by improving cloud server failure prediction, using the selected AI techniques and the combined system metrics approach.
We hypothesised that combining multiple system metrics would improve failure prediction, and in this paper we present our research on cloud server failure prediction.
Summary
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cloud computing has emerged as the fifth utility over the last decade, and is a backbone of the modern economy [1]. It is a model of computing that allows flexible use of virtual servers, massive scalability, and management services for the delivery of information services. With the low-cost pay-per-use model of on-demand computing [2], the cloud has grown massively over the years, both in size and in complexity.