Abstract

Failure prediction for high-performance computing clusters (HPCC) has been a crucial and actively studied problem for many years. Previous works have failed to provide a robust method for real-time failure prediction of HPCC. The available techniques are dated, unrealistic, and provide low accuracy. This paper presents an efficient technique that provides robust failure prediction with good accuracy using state-of-the-art models. We combine long short-term memory (LSTM) networks with reinforcement learning to improve prediction accuracy in real time and to offer the industry a solution with reliable results.
