High-quality short-term forecasts of wind farm generation are crucial for the rapidly growing renewable energy sector. This article addresses the selection of appropriate gradient-boosted decision tree (GBDT) models for forecasting wind farm energy generation with a 10-min time horizon. Most forecasting studies use a single gradient-boosted decision tree model and compare its performance with other machine learning (ML) techniques, and sometimes with a naive baseline model. This paper instead presents a comprehensive comparison of the four gradient-boosted decision tree models used for forecasting: GBDT, eXtreme Gradient Boosting (XGBoost), Light Gradient-Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). The objective is to evaluate each model in terms of forecasting accuracy for wind farm energy generation (forecasting error) and computational time during model training. Computational time is a critical factor because numerous models with varying hyperparameters must be tested to identify the settings that minimize forecasting error. Forecast quality obtained with default hyperparameters serves as the reference. The research also seeks to determine the most effective sets of input variables for the predictive models. The article concludes with findings and recommendations regarding the preferred GBDT models. Among the four tested models, the oldest one, GBDT, required a significantly longer training time, which should be considered a major drawback of this implementation of gradient-boosted decision trees. In terms of forecast quality, however, the lowest normalized root mean square error (nRMSE) was also achieved by GBDT, in its tuned version (with the best hyperparameter values obtained from exploring 40,000 combinations).
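The two evaluation criteria named above, forecasting error and training time, can be illustrated with a minimal sketch. It is not the study's code: the normalization constant (here a hypothetical rated power), the sample values, and the helper names `nrmse`, `persistence_forecast`, and `timed` are all assumptions made for illustration. A naive persistence baseline stands in for a trained model.

```python
import math
import time
from typing import Callable, List, Sequence, Tuple


def nrmse(actual: Sequence[float], predicted: Sequence[float], norm: float) -> float:
    """Root-mean-square error divided by a normalizing constant.

    `norm` is an assumption: the farm's rated power is a common choice,
    but the abstract does not state which normalization the study uses.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if norm <= 0:
        raise ValueError("norm must be positive")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / norm


def persistence_forecast(series: Sequence[float]) -> List[float]:
    """Naive baseline: the forecast for step t is the observation at t-1."""
    return list(series[:-1])


def timed(fn: Callable[[], object]) -> Tuple[object, float]:
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Made-up 10-min generation values in MW, purely illustrative.
generation = [10.0, 12.0, 11.0, 15.0, 14.0]
rated_power = 20.0  # hypothetical installed capacity in MW

# Time the "model" (here the baseline) and score it on the same data.
forecast, seconds = timed(lambda: persistence_forecast(generation))
score = nrmse(generation[1:], forecast, rated_power)
print(f"nRMSE = {score:.4f}, elapsed = {seconds:.6f} s")
```

In a comparison like the one described, the callable passed to `timed` would be the training routine of each gradient-boosted model, and the same `nrmse` call would score every model on a common test set so that error and training time can be reported side by side.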