Forecasting oil futures returns with news

Zhiyuan Pan, Hao Zhong, Yudong Wang, Juan Huang

doi:10.1016/j.eneco.2024.107606

Abstract

This paper explores the extent to which text data contain valuable information for predicting oil futures returns. A novel mixed-frequency data sampling random forest regression (MIDAS-RF) approach is proposed to construct a textual indicator. This approach can extract nonlinear and interaction information from news and handles mixed-frequency, high-dimensional data better. Compared with traditional sentiment variables and financial factors, our indicator delivers better forecasting performance both statistically and economically, with a monthly out-of-sample R² of 5.26% and an annualized certainty equivalent return gain of 3.08%. Further evidence suggests that the predictability of the textual indicator is primarily driven by words related to capital markets and macroeconomic topics.
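For readers unfamiliar with the out-of-sample R² statistic quoted above, the following is a minimal sketch of how such a figure is commonly computed. The abstract does not state the benchmark, so this assumes the convention widely used in return forecasting: the model's squared forecast errors are compared with those of an expanding historical-average forecast. The function names and the toy return series are hypothetical and purely illustrative; they are not taken from the paper.

```python
def out_of_sample_r2(actual, forecast, benchmark):
    """R2_OS = 1 - SSE(model) / SSE(benchmark) over the evaluation window.

    Positive values mean the model beats the benchmark in squared-error terms.
    """
    if not actual or not (len(actual) == len(forecast) == len(benchmark)):
        raise ValueError("inputs must be non-empty and of equal length")
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    if sse_bench == 0:
        raise ValueError("benchmark errors are all zero; R2_OS is undefined")
    return 1.0 - sse_model / sse_bench


def expanding_mean_benchmark(returns, first_forecast_index):
    """Historical-average forecast of returns[t], using returns[:t] only."""
    if first_forecast_index < 1:
        raise ValueError("need at least one past observation")
    return [sum(returns[:t]) / t
            for t in range(first_forecast_index, len(returns))]


# Toy monthly returns (assumed values, for illustration only).
returns = [0.02, -0.01, 0.03, 0.00, 0.01]
start = 2                                   # first month that is forecast
actual = returns[start:]
benchmark = expanding_mean_benchmark(returns, start)

# A hypothetical model forecast for the same three months.
model_forecast = [0.02, 0.005, 0.01]
r2_os = out_of_sample_r2(actual, model_forecast, benchmark)
print(f"out-of-sample R2: {r2_os:.4f}")
```

Under this definition, a forecast identical to the benchmark yields exactly zero, and a perfect forecast yields one, so a monthly value of a few percent indicates a modest reduction in squared forecast error relative to the historical average.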

Full Text