Abstract

This paper explores the extent to which text data contain valuable information for predicting oil futures returns. We propose a novel mixed-frequency data sampling random forest regression (MIDAS-RF) approach to construct a textual indicator. The approach extracts nonlinear and interaction information from news and better handles mixed-frequency, high-dimensional data. Compared with traditional sentiment variables and financial factors, our indicator delivers better forecasting performance both statistically and economically, with a monthly out-of-sample R² of 5.26% and an annualized certainty equivalent return gain of 3.08%, respectively. Further evidence suggests that the predictive power of the textual indicator is primarily driven by words related to capital markets and macroeconomic topics.
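
For readers unfamiliar with the out-of-sample R² statistic quoted above, the following is a minimal sketch of how such a figure is commonly computed in the return-predictability literature. It assumes the benchmark is an expanding-window historical mean forecast; the abstract does not state which benchmark is used, so that choice, along with the function names `out_of_sample_r2` and `expanding_mean_benchmark`, is an illustrative assumption rather than the authors' procedure.

```python
import numpy as np


def out_of_sample_r2(actual, forecast, benchmark):
    """Return 1 - SSE(model) / SSE(benchmark) over the evaluation period.

    A positive value means the model's forecasts have a lower squared
    error than the benchmark's forecasts.
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    benchmark = np.asarray(benchmark, dtype=float)
    sse_model = np.sum((actual - forecast) ** 2)
    sse_bench = np.sum((actual - benchmark) ** 2)
    return 1.0 - sse_model / sse_bench


def expanding_mean_benchmark(returns, first_forecast_index):
    """Historical-mean forecast for each period t >= first_forecast_index.

    Assumption: the forecast for period t is the mean of all returns
    observed strictly before t, so no future information is used.
    """
    returns = np.asarray(returns, dtype=float)
    return np.array(
        [returns[:t].mean() for t in range(first_forecast_index, len(returns))]
    )


# Hypothetical monthly returns, used only to show the call pattern.
returns = [0.02, -0.01, 0.03, 0.00, 0.01]
start = 2
actual = returns[start:]
bench = expanding_mean_benchmark(returns, start)
```

Under this definition, a forecast identical to the benchmark scores exactly zero and a perfect forecast scores one, so a monthly value such as 5.26% indicates a reduction in squared forecast error relative to the benchmark.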
