Abstract

Background: Cloud services have become a popular way to deliver efficient services for a wide range of activities. Predicting hardware failures in a cloud data center can minimize downtime and make the system more reliable and fault-tolerant.

Objective: This research analyzes a machine-learning-based hardware failure prediction model that anticipates the remediations required for undiagnosed failures in a cloud computing system (CCS) serving multiclass requests.

Methods: The model is tested on a cloud data center designed to categorize incoming requests as web, compute, storage, or dedicated server requests. To demonstrate improved reliability, a test case is run on ReliaCloud-NS, a simulator for creating a CCS and computing its reliability.

Results: Using the model considerably enhanced the reliability of the cloud computing system compared with not using it.

Conclusion: Although various estimation methods are patented for evaluating the system reliability of a cloud computing network, this study focused mainly on improving the reliability of request-segregated clouds when hardware resources such as CPU, memory, bandwidth, and hard disk fail. The prediction model could also be extended to other system resources such as GPUs, software, and database packages.
