Abstract

We model multi-period cumulative and forward corporate default probabilities using machine learning methods and introduce a novel hybrid econometric-machine learning model that combines tree-boosting with a latent frailty model. The frailty component allows for modeling correlation that is not accounted for by observable predictor variables. We find that machine learning methods have higher prediction accuracy than linear models, with the differences being larger for longer prediction horizons. The likely reason is that interaction effects are stronger at long prediction horizons than at short ones. Among all methods, tree-boosting has the highest prediction accuracy. Further, the frailty component of the newly proposed “LaGaBoost frailty model” is large overall and exhibits strong variation over time. In contrast to prior research, we find that the upper-tail predictions of loan portfolio losses from frailty models are not consistently higher over time than those from models that ignore frailty correlation, but they show more temporal variation.
