Site-specific weather forecasts are essential for accurate prediction of power demand and are consequently of great interest to energy operators. However, weather forecasts from current numerical weather prediction (NWP) models lack the fine-scale detail needed to capture all important characteristics of localised real-world sites. Instead, they provide weather information representing a rectangular gridbox (usually kilometres in size). Even after post-processing and bias correction, area-averaged information is usually not optimal for specific sites. Prior work on site-optimised forecasts has focused on linear methods, weighted consensus averaging, and time-series methods, among others. Recent developments in machine learning (ML) have prompted increasing interest in applying ML as a novel approach to this problem. In this study, we investigate the feasibility of optimising forecasts at sites using gradient-boosted decision trees, a popular ML model, as implemented in the XGBoost package (v1.7.3) for Python. Regression trees were trained on historical NWP output and site observations to predict temperature and dew point at multiple site locations across Australia. We developed a working ML framework, named “Multi-SiteBoost”, and initial test results show a significant improvement compared with gridded values from bias-corrected NWP models. The improvement in temperature forecasts from XGBoost (0.1–0.6 °C, or 4–27%) is found to be comparable with that of non-ML methods reported in the literature. Using the insights provided by SHapley Additive exPlanations (SHAP), this study also tests various approaches to understanding the ML predictions and increasing the reliability of the forecasts generated by ML.