Abstract

Despite the advances in machine learning (ML) methods which have been extensively applied in credit scoring with positive results, there are still very important unresolved issues, pertaining not only to academia but to practitioners and the industry as well, such as model drift as an inevitable consequence of population drift and the strict regulatory obligations for transparency and interpretability of the automated profiling methods. We present a novel adaptive behavioral credit scoring scheme which uses online training for each incoming inquiry (a borrower) by identifying a specific region of competence to train a local model. We compare different classification algorithms, i.e., logistic regression with state-of-the-art ML methods (random forests and gradient boosting trees) that have shown promising results in the literature. Our data sample has been derived from a proprietary credit bureau database and spans a period of 11 years with a quarterly sampling frequency, consisting of 3,520,000 record-months observations. Rigorous performance measures used in credit scoring literature and practice (such as AUROC and the H-Measure) indicate that our approach deals effectively with population drift and that local models outperform their corresponding global ones in all cases. Furthermore, when using simple local classifiers such as logistic regression, we can achieve comparable results with the global ML ones which are considered “black box” methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call