Abstract

We study the impact of using machine learning (ML) models for credit default prediction on the calculation of regulatory capital by financial institutions, using a unique, anonymized database from a major Spanish bank. We first compare the statistical performance of five supervised learning models, namely Logistic Lasso, Trees (CART), Random Forest, XGBoost and Deep Learning, against a well-known benchmark, Logit. We measure statistical performance with several metrics, across different sample sizes and sets of available features. We find that the ML models outperform Logit, even when a relatively small amount of data is used. We then translate this statistical performance into economic impact by estimating the savings in capital from using an advanced ML model instead of a simpler one to compute risk-weighted assets under the Internal Ratings Based (IRB) approach. Our benchmark results show that implementing XGBoost instead of Logistic Lasso could yield savings of 12.4% to 17% in regulatory capital requirements.
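
To make the link between a model's predicted default probability and regulatory capital concrete, the following is a minimal sketch of the supervisory IRB risk-weight function without maturity adjustment. It is not the paper's own calculation. The function names, the example inputs, and the default asset correlation of 0.15 (the fixed value the Basel framework assigns to residential mortgage exposures) are assumptions made for illustration; the abstract does not state which exposure class or parameters the authors use.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, std dev 1)


def capital_requirement(pd: float, lgd: float, rho: float = 0.15) -> float:
    """Capital per unit of exposure (K), without maturity adjustment.

    pd  : probability of default predicted by the credit model
    lgd : loss given default, as a fraction of exposure
    rho : asset correlation (0.15 is an assumed, illustrative default)
    """
    if not 0.0 < pd < 1.0:
        raise ValueError("pd must lie strictly between 0 and 1")
    # PD conditional on a 99.9th-percentile draw of the systematic risk factor
    stressed_pd = _STD_NORMAL.cdf(
        (_STD_NORMAL.inv_cdf(pd) + rho ** 0.5 * _STD_NORMAL.inv_cdf(0.999))
        / (1.0 - rho) ** 0.5
    )
    # Capital covers unexpected loss only: stressed loss minus expected loss
    return lgd * (stressed_pd - pd)


def risk_weighted_assets(pd: float, lgd: float, ead: float, rho: float = 0.15) -> float:
    """Risk-weighted assets for one exposure: 12.5 * EAD * K."""
    return 12.5 * ead * capital_requirement(pd, lgd, rho)


# Hypothetical inputs: the same exposure scored by two different models.
rwa_a = risk_weighted_assets(pd=0.02, lgd=0.45, ead=100.0)
rwa_b = risk_weighted_assets(pd=0.01, lgd=0.45, ead=100.0)
print(f"relative RWA difference: {1.0 - rwa_b / rwa_a:.1%}")
```

Because K is a nonlinear function of PD, two models with the same average predicted PD can still produce different total risk-weighted assets, depending on how each model distributes PDs across borrowers.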
