Abstract
This paper explores the extent to which text data contain valuable information for predicting oil futures returns. We propose a novel mixed-frequency data sampling random forest regression (MIDAS-RF) approach to construct a textual indicator. The approach extracts nonlinear and interaction information from news and is better suited to handling mixed-frequency, high-dimensional data. Compared with traditional sentiment variables and financial factors, our indicator delivers better forecasting performance both statistically and economically, with a monthly out-of-sample R² of 5.26% and an annualized certainty equivalent return gain of 3.08%. Further evidence suggests that the predictive power of the textual indicator is primarily driven by words related to capital markets and macroeconomic topics.
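For readers unfamiliar with the out-of-sample R² statistic, the following minimal sketch shows how such a figure is commonly computed. It assumes the benchmark is an expanding historical-mean forecast, a widespread convention in return predictability studies; the abstract does not state which benchmark is used, and the function names `out_of_sample_r2` and `expanding_mean_benchmark` are hypothetical.

```python
def expanding_mean_benchmark(history, actual):
    """Benchmark forecast for each period: mean of all returns seen before it.

    Assumption: a historical-mean benchmark; the abstract does not specify one.
    """
    seen = list(history)
    if not seen:
        raise ValueError("history must contain at least one return")
    forecasts = []
    for realized in actual:
        forecasts.append(sum(seen) / len(seen))
        seen.append(realized)  # the realized return becomes known afterwards
    return forecasts


def out_of_sample_r2(actual, forecast, benchmark):
    """Return 1 - SSE(model) / SSE(benchmark) over the evaluation period."""
    if not actual or not (len(actual) == len(forecast) == len(benchmark)):
        raise ValueError("inputs must be non-empty and of equal length")
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    if sse_bench == 0:
        raise ValueError("benchmark forecast errors are all zero")
    return 1.0 - sse_model / sse_bench


# Illustrative numbers only, not data from the paper.
history = [0.01, 0.01]
actual = [0.02, -0.01, 0.03]
benchmark = expanding_mean_benchmark(history, actual)
```

A positive value means the model's squared forecast errors are smaller than the benchmark's, so a forecast identical to the benchmark scores exactly zero.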