Abstract

High Performance Computing (HPC) systems play an important role in advancing scientific research as the demand for processing power and speed grows. In practice, HPC systems are also of increasing interest to businesses that rely on this growing technology. The growing complexity of HPC systems has exposed them to a wide range of performance anomalies. Continuously managing the health of such systems has a major financial and operational impact. Several machine learning techniques can be used to identify these performance anomalies in such complex systems. This study compares three of the most commonly used supervised machine learning algorithms for anomaly detection. We applied these algorithms to memcpy metrics from the Fundacion Publica Galega Centro Tecnoloxico de Supercomputacion de Galicia (CESGA); memcpy is a benchmark used to measure memory performance for each CPU socket. Our study shows that the Neural Network algorithm had the highest accuracy (93%), the KNN algorithm had the highest precision (0.97), the Gaussian Anomaly Detection algorithm had the highest recall (0.99), and the Neural Network algorithm had the highest F-measure (0.96).
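For reference, the four metrics reported above can all be derived from the counts in a binary confusion matrix. The following is a minimal sketch, assuming that anomalies are treated as the positive class and that "F-measure" refers to the balanced F1 score; the function name `detection_metrics` and the sample counts are illustrative and are not taken from the study.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives, with "positive" meaning "flagged as an anomaly"
    (an assumption made for this sketch).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything flagged as anomalous, how much really was.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all real anomalies, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, chosen only to exercise the function.
example = detection_metrics(tp=40, fp=10, tn=45, fn=5)
```

Because precision and recall penalize different kinds of error (false alarms versus missed anomalies), it is plausible for different algorithms to lead on each metric, as in the results summarized above.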
