Abstract

This paper investigates how well ensemble boosting tree models forecast the volatility of China's crude oil futures, combining a rich set of feature variables with multiple volatility forecasting models. The empirical results show that ensemble boosting tree models significantly outperform both the HAR-RV model and traditional machine learning models, with CatBoost and LightGBM delivering the best forecasting performance, and these conclusions hold up under robustness tests. Using SHAP values as an interpretability tool, the paper examines LightGBM and CatBoost along three dimensions: the drivers of the volatility forecasts, the contribution of variables in a specific period, and the performance of variables in forecasting outliers. Macroeconomic variables and HAR-type variables are found to contribute differently to the forecasts in CatBoost and in LightGBM, and the contribution of different variables varies significantly within a single forecasting interval. The forecast contribution of the same predictor is also heterogeneous across models, so the selection of variables for volatility forecasting should depend on the specific model and forecasting context. Lastly, additional analysis confirms that the ensemble boosting tree models also deliver high economic value.
