Abstract
Credit scoring plays an important role in financial institutions, debt-based crowdfunding platforms, and peer-to-peer lending platforms. In the last few years, ensemble methods have become much more popular for credit scoring. However, the performance of ensemble methods is easily affected by the parameter settings and the number of base classifiers. Ensemble classification based on clustering can determine the best number of base classifiers automatically through clustering, and can find optimal parameter settings for the base classifiers by training each one individually on a training subset formed by combining clusters. In this way, the adverse effect of manually setting the parameters and the number of base classifiers can be avoided. However, conventional clustering methods do not consider that attributes contribute differently to the distance metric, which may decrease the performance of the ensemble classifiers built on them. Moreover, unbalanced training subsets degrade the performance of the base classifiers, which in turn leads to poor performance of the ensemble classifier. To address these problems, our approach first assigns different weights to different variables when measuring the distance between two instances in the clustering step, and then adopts the Subagging resampling method to deal with unbalanced training subsets in the training process. Experimental results show that our approach can improve the performance of the ensemble classifier.
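The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the attribute weights are obtained or which Subagging variant is used, so the sketch assumes a weighted Euclidean distance with weights supplied by the caller, and a class-balanced form of subsampling without replacement. The names `weighted_distance` and `subagging_subsets` are hypothetical.

```python
import math
import random


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance between two instances.

    Assumption: the weights are given; how they are derived is not
    specified in the abstract.
    """
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def subagging_subsets(labels, n_subsets, seed=0):
    """Draw class-balanced index subsets by sampling without replacement.

    Assumption: each subset takes, from every class, as many instances
    as the smallest class contains. One base classifier would then be
    trained per subset and their predictions aggregated.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    size = min(len(idx) for idx in by_class.values())
    subsets = []
    for _ in range(n_subsets):
        subset = []
        for idx in by_class.values():
            # rng.sample draws without replacement
            subset.extend(rng.sample(idx, size))
        subsets.append(sorted(subset))
    return subsets
```

Setting an attribute's weight to zero removes it from the distance entirely, which shows how the weights control each attribute's contribution to the clustering step.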