Abstract

Recent electricity price forecasting (EPF) studies suggest that the least absolute shrinkage and selection operator (LASSO) leads to well-performing models that are generally better than those obtained from other variable selection schemes. By conducting an empirical study involving datasets from two major power markets (Nord Pool and PJM Interconnection), three expert models, two multi-parameter regression models (called baseline models), and four variance stabilizing transformations combined with the seasonal component approach, we discuss the optimal way of implementing the LASSO. We show that using a complex baseline model with nearly 400 explanatory variables, a well-chosen variance stabilizing transformation (asinh or N-PIT), and a procedure that recalibrates the LASSO regularization parameter once or twice a day indeed leads to significant accuracy gains compared to the typically considered EPF models. Moreover, by analyzing the structures of the best LASSO-estimated models, we identify the most important explanatory variables and thus provide guidelines for structuring better-performing models.
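As a rough illustration of the asinh variance stabilizing transformation named above, the following Python sketch normalizes prices and then applies the area hyperbolic sine. The normalization by the median and the scaled median absolute deviation (factor 1.4826) is an assumption borrowed from common practice and is not stated in this summary; the function names `asinh_vst` and `inverse_asinh_vst` are hypothetical.

```python
import math
import statistics


def asinh_vst(prices):
    """Normalize prices, then apply asinh to damp spikes.

    Assumption: center on the median (a) and scale by the median absolute
    deviation times 1.4826 (b). The summary does not specify this step.
    """
    a = statistics.median(prices)
    mad = statistics.median([abs(p - a) for p in prices])
    b = 1.4826 * mad
    if b == 0.0:
        raise ValueError("prices have zero spread; cannot normalize")
    transformed = [math.asinh((p - a) / b) for p in prices]
    return transformed, a, b


def inverse_asinh_vst(values, a, b):
    """Map transformed values (e.g. forecasts) back to the price scale."""
    return [math.sinh(v) * b + a for v in values]


if __name__ == "__main__":
    # Hypothetical hourly prices with one spike and one negative price.
    prices = [30.0, 32.5, 28.0, 31.0, 250.0, -5.0, 29.5]
    z, a, b = asinh_vst(prices)
    print([round(v, 2) for v in z])
    print([round(p, 2) for p in inverse_asinh_vst(z, a, b)])
```

Unlike a logarithm, asinh is defined for zero and negative prices, which is one plausible reason it suits electricity markets. In a forecasting pipeline the model would be fitted on the transformed series and its predictions mapped back with the inverse.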

Highlights

  • One of the challenges of short-term electricity price forecasting (EPF) is variable selection. There is no standard approach when the number of potential explanatory variables is large. So far, the typical approach has been to select predictors in an ad hoc fashion or by using expert knowledge [1].

  • We conduct a comprehensive empirical study involving nearly 5-year long datasets from two major power markets (Nord Pool and PJM Interconnection), multiple model structures, a range of alternative procedures for estimating λ, four variance stabilizing transformations combined with the seasonal component approach, and an evaluation in terms of the classical error measure for point forecasts and the Diebold–Mariano (DM) test [17] to determine significant differences in forecasting accuracy

  • Models whose parameters are estimated with the least absolute shrinkage and selection operator (LASSO) are denoted by ∗LassoiD, where i = 1, 2 refers to one of two baseline models; D is the length of the validation window; and the asterisk represents the λ-selection scheme


Summary

Introduction

Concerning variable (or feature) selection, i.e., the optimal structure of the baseline model, we identified the most important variables and provided guidelines for structuring better-performing expert models. Although the LASSO typically uses only a small fraction of the initial set of explanatory variables, providing additional information in the underlying model significantly improves the accuracy of the obtained forecasts. Lagged exogenous variables (i.e., load and wind generation forecasts for the past days) are seldom selected and can be ignored when building models. The dummy-linked load forecasts turned out to be of mixed explanatory power, while the dummy-linked price averages for the previous day were shown to be redundant. Regarding the choice of the LASSO tuning parameter, we found that using one λ for all days and hours in the test period (as in Uniejewski et al. [14]) is an acceptable option, but it is recommended only if the computational time needs to be significantly reduced.
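To make the two ideas in this paragraph concrete (the LASSO keeping only a fraction of the regressors, and λ being picked on a validation window), here is a minimal pure-Python sketch. It is not the authors' implementation: the objective scaling (1/(2n)), the cyclic coordinate descent solver, the λ grid, the toy data, and all function names are assumptions made for illustration. The validation criterion is the mean absolute error, in line with the MAE-based evaluation listed in the outline.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 (assumed scaling)
    by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def mae(X, y, beta):
    """Mean absolute error of the linear forecast X b against y."""
    preds = (sum(x * b for x, b in zip(row, beta)) for row in X)
    return sum(abs(t - f) for t, f in zip(y, preds)) / len(y)


def select_lambda(X_fit, y_fit, X_val, y_val, lambdas):
    """Return (lam, validation MAE, beta) with the lowest validation MAE."""
    results = []
    for lam in lambdas:
        beta = lasso_cd(X_fit, y_fit, lam)
        results.append((lam, mae(X_val, y_val, beta), beta))
    return min(results, key=lambda r: r[1])


def make_toy_data(n=120, p=20, seed=0):
    """Hypothetical data: only the first three of p regressors drive y."""
    rng = random.Random(seed)
    true_beta = [2.0, -1.5, 1.0] + [0.0] * (p - 3)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0.0, 0.1)
         for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    X_fit, y_fit, X_val, y_val = X[:90], y[:90], X[90:], y[90:]
    grid = [1.0, 0.3, 0.1, 0.03, 0.01]  # hypothetical lambda grid
    lam, err, beta = select_lambda(X_fit, y_fit, X_val, y_val, grid)
    kept = [j for j, b in enumerate(beta) if b != 0.0]
    print(f"lambda={lam}, validation MAE={err:.3f}, "
          f"kept {len(kept)} of {len(beta)} regressors")
```

Re-running `select_lambda` on a rolling window once or twice a day would mimic the frequent recalibration discussed in the abstract, whereas calling it once and reusing the result corresponds to the cheaper single-λ option. The indices in `kept` play the role of the selected explanatory variables.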

The Test Ground
Seasonal Decomposition
Variance Stabilizing Transformations
The Forecasting Framework
Benchmarks
Baseline Autoregressive Models
LASSO-Estimated Models
LassOLS-Type Models
Evaluation in Terms of MAE and the DM Test
Performance across Model Classes and VSTs
Performance across λ-Selection Schemes
Variable Selection