Machine learning and oil price point and density forecasting

Alexandre Bonnet R Costa, Pedro Cavalcanti G Ferreira, Wagner P Gaglianone, Osmani Teixeira C Guillén, João Victor Issler, Yihao Lin

doi:10.1016/j.eneco.2021.105494

Open Access

https://doi.org/10.1016/j.eneco.2021.105494

Abstract

This paper explores machine learning techniques for forecasting the oil price. In the era of big data, we investigate whether new automated tools can improve on traditional approaches in terms of forecast accuracy. Oil price point and density forecasts are built from 23 methods, including regression trees (random forest, quantile regression forest, XGBoost), regularization procedures (elastic net, lasso, ridge), standard econometric models and forecast combinations, as well as the structural factor model of Schwartz and Smith (2000). The database contains 315 macroeconomic and financial variables, which are used to build high-dimensional models. To evaluate the predictive power of each method, we conduct an extensive pseudo out-of-sample forecasting exercise at monthly and quarterly frequencies, with horizons from one month up to five years. Overall, the results indicate that the machine learning methods perform well in the short run. Up to six months, lasso-based models, oil futures prices, the VECM and the Schwartz–Smith model provide the best forecasts. At longer horizons, forecast combinations also become relevant. In several cases, the accuracy gains relative to the random walk forecast are statistically significant and reach double digits, in percentage terms, according to the out-of-sample R² statistic, a notable achievement compared with the previous literature.
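As an illustration of the evaluation metric mentioned above, the following is a minimal sketch of an out-of-sample R² computed against a random walk (no-change) benchmark. The abstract does not spell out the formula, so the sketch assumes the commonly used definition, one minus the ratio of the model's sum of squared forecast errors to the benchmark's. The function names, the price series, and the model forecasts are hypothetical and serve only to show the arithmetic; they are not taken from the paper.

```python
def random_walk_forecasts(prices, horizon):
    """No-change forecast: the forecast for period t + horizon is the price at t."""
    if horizon < 1 or horizon >= len(prices):
        raise ValueError("horizon must be between 1 and len(prices) - 1")
    return prices[:-horizon]


def r2_out_of_sample(actual, model_forecast, benchmark_forecast):
    """Assumed definition: 1 - SSE(model) / SSE(benchmark).

    Positive values mean the model has a lower squared forecast error
    than the benchmark over the evaluation sample.
    """
    if not (len(actual) == len(model_forecast) == len(benchmark_forecast)):
        raise ValueError("all series must have the same length")
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, model_forecast))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark_forecast))
    if sse_bench == 0:
        raise ValueError("benchmark has zero forecast error; ratio undefined")
    return 1.0 - sse_model / sse_bench


# Made-up monthly prices and made-up one-step-ahead model forecasts.
prices = [60.0, 62.0, 61.0, 65.0, 64.0, 68.0]
horizon = 1
actual = prices[horizon:]                      # realized values being forecast
benchmark = random_walk_forecasts(prices, horizon)
model = [61.0, 61.5, 64.0, 64.5, 67.0]         # hypothetical model output

r2 = r2_out_of_sample(actual, model, benchmark)
print(f"Out-of-sample R2 vs random walk: {r2:.4f}")
```

A value of 0.10 under this definition would correspond to a 10% reduction in squared forecast error relative to the random walk, which is the sense in which gains can "reach double digits, in percentage terms". Whether a gain is statistically significant requires a separate test, which this sketch does not cover.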