Energy forecasting models deployed in industrial applications face uncertainty with respect to data availability due to network latency, equipment malfunctions, or data-integrity attacks. In particular, forecasting performance is severely challenged when a subset of the features used for model training becomes unavailable during operational use. Ad-hoc solutions, e.g., retraining without the missing features, may work for a small number of features, but they soon become impractical, as the number of required models grows exponentially with the number of features. In this work, we present a principled approach to introducing resilience against missing features in energy forecasting applications via robust optimization. Specifically, we formulate a robust regression model that is optimally resilient against missing features at test time, considering both point and probabilistic forecasting. We develop three solution methods for the proposed robust formulation, all leading to linear programming problems, with varying degrees of tractability and conservativeness. We provide an extensive empirical validation of the proposed methods in prevalent applications, namely electricity price, load, wind production, and solar production forecasting, and we compare them against well-established benchmark models and against standard methods for dealing with missing features, i.e., imputation and retraining. Our results demonstrate that the proposed robust optimization approach outperforms imputation-based models and performs similarly to retraining without the missing features, while remaining computationally practical. To the best of our knowledge, this is the first work that introduces resilience against missing features into energy forecasting.