Advanced hyperparameter optimization of deep learning models for wind power prediction

Shahram Hanifi, Andrea Cammarono, Hossein Zare-Behtash

doi:10.1016/j.renene.2023.119700

Abstract

The uncertainty of wind power, the main obstacle to its integration into the power grid, can be addressed by accurate and efficient wind power forecasting. Among the various wind power forecasting methods, machine learning (ML) algorithms are recognized as a powerful tool; however, their performance depends heavily on proper tuning of their hyperparameters. Common hyperparameter tuning methods such as grid search or random search are time-consuming, computationally expensive, and unreliable for complex models such as deep neural networks. Therefore, there is an urgent need for automatic methods that discover optimal hyperparameters and thereby improve the accuracy and efficiency of prediction models. This study contributes to the field of wind power forecasting a comprehensive comparison of three advanced techniques (Scikit-opt, Optuna, and Hyperopt) for hyperparameter optimization of Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) network models, a facet that, to our knowledge, has not been systematically explored in the existing literature. The impact of these optimization techniques on the accuracy and efficiency of the CNN and LSTM models is assessed by comparing the root mean square error (RMSE) of the predictions and the time required to tune the models. The results show that the Optuna algorithm, using a Tree-structured Parzen Estimator (TPE) search method and an Expected Improvement (EI) acquisition function, has the best efficiency for both CNN and LSTM models. In terms of accuracy, all the optimization methods achieve similar performance for the CNN model, whereas the LSTM model optimized by the Hyperopt algorithm, based on the annealing search method, achieves the highest accuracy.
In addition, for the first time, this research investigates the impact of random initialization on the performance of neural-network-based forecasting models. The proposed deep learning model structures were examined to determine the most robust one, that is, the structure least sensitive to randomness. The findings from this comparison of advanced hyperparameter optimization methods can be used by researchers to tune time-series-based forecasting models.

Full Text