Workload Characterization in HPC Environment for Auto-scaling of Resources – Preliminary Study

Mahesh Barve, Rahul Padmakar Hardikar, Sharad Sinha, Writam Mallik, Ashok Gunturu

doi:10.1109/indicon56171.2022.10040124

Abstract

Machine learning and deep learning techniques have been shaping many sectors, one of which is High-Performance Computing (HPC). Because workload prediction assists decision-making and resource management, many statistical and machine learning techniques have been employed to predict workloads. This paper presents several machine learning and ensemble learning techniques, along with a neural network approach, to predict the run time of CP2K, a scientific application for molecular dynamics simulation. Its run time was characterized under various resource allocation and algorithmic parameters. The data thus generated were used to train Random Forest, SVM, XGBoost, CatBoost, GradientBoost, and Neural Network models. The trained models help predict CP2K's run time under different resource allocation and algorithmic parameters, thus enabling HPC users to make better resource allocation requests. Although all the models showed promising results during both the training and testing phases, Random Forest outperformed the others in terms of Mean Squared Error (MSE).

Full Text