Abstract

For the sake of credit risk assessment, credit scoring has become a critical tool to discriminate “bad” applicants from “good” applicants for financial institutions. Accordingly, a wide range of supervised machine learning algorithms have been successfully applied to credit scoring; however, integration of unsupervised learning with supervised learning in this field has drawn little consideration. In this work, we propose a combination strategy of integrating unsupervised learning with supervised learning for credit risk assessment. The difference between our work and other previous work on unsupervised integration is that we apply unsupervised learning techniques at two different stages: the consensus stage and dataset clustering stage. Comparisons of model performance are performed based on three credit datasets in four groups: individual models, individual models + consensus model, clustering + individual models, clustering + individual models + consensus model. As a result, integration at either the consensus stage or dataset clustering stage is effective on improving the performance of credit scoring models. Moreover, the combination of the two stages achieves the best performance, thereby confirming the superiority of the proposed integration of unsupervised and supervised machine learning algorithms, which boost our confidence that this strategy can be extended to many other credit datasets from financial institutions.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.