AbstractPredicting the execution time of weather forecast models is a complex task, since these models are usually performed on High Performance Computing systems that require large computing capabilities. Indeed, a reliable prediction can imply several benefits, by allowing for an improved planning of the model execution, a better allocation of available resources, and the identification of possible anomalies. However, to make such predictions is usually hard, since there is a scarcity of datasets that benchmark the existing meteorological simulation models. In this work, we focus on the runtime predictions of the execution of the COSMO (COnsortium for SMall-scale MOdeling) weather forecasting model used at the Hydro-Meteo-Climate Structure of the Regional Agency for the Environment and Energy Prevention Emilia-Romagna. We show how a plethora of Machine Learning approaches can obtain accurate runtime predictions of this complex model, by designing a new well-defined benchmark for this application task. Indeed, our contribution is twofold: 1) the creation of a large public dataset reporting the runtime of COSMO run under a variety of different configurations; 2) a comparative study of ML models, which greatly outperform the current state-of-practice used by the domain experts. This data collection represents an essential initial benchmark for this application field, and a useful resource for analyzing the model performance: better accuracy in runtime predictions could help facility owners to improve job scheduling and resource allocation of the entire system; while for a final user, a posteriori analysis could help to identify anomalous runs.