Abstract
Assessing customer default is an essential basis for personal credit issuance. This paper develops a personal credit default discrimination model based on the Super Learner heterogeneous ensemble to improve the accuracy and robustness of default discrimination. First, we select six single classifiers, such as logistic regression and SVM, and three homogeneous ensemble classifiers, such as random forest, to build a candidate library of base classifiers for Super Learner. Then, we train each base classifier with ten-fold cross-validation to improve its robustness. We compute each base classifier's total loss from the difference between the predicted and actual values and establish a weighted optimization model to solve for the optimal base classifier weights, which minimize the weighted total loss of all base classifiers. Thus, we obtain the heterogeneous ensembled Super Learner classifier. Finally, we test the effectiveness of the ensembled Super Learner model on three real credit datasets from the UCI database (Australian, Japanese, and German) and on the large credit dataset GMSC published on the Kaggle platform. We employ four commonly used evaluation indicators: accuracy rate, type I error rate, type II error rate, and AUC. Compared with the individual base classifiers and with heterogeneous models such as Stacking and Bstacking, the ensembled Super Learner model achieves higher discrimination accuracy and robustness.
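The four evaluation indicators named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's evaluation code. It assumes that label 1 marks a default, that a score threshold of 0.5 is used, and that the type I error rate is the share of non-defaulters flagged as defaults while the type II error rate is the share of defaulters that are missed. The excerpt does not define these terms and credit scoring papers differ on the convention, so the two error definitions may need to be swapped. The function name `credit_metrics` and the toy labels are hypothetical.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score


def credit_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, type I/II error rates, and AUC for default scores.

    Assumed convention (not taken from the paper): 1 = default,
    type I = FP / (FP + TN), type II = FN / (FN + TP).
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    # Turn predicted default probabilities into hard labels.
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "type_I_error": fp / (fp + tn),
        "type_II_error": fn / (fn + tp),
        # AUC is computed from the scores, so it does not depend on the threshold.
        "auc": roc_auc_score(y_true, y_score),
    }


# Toy data for illustration only.
example = credit_metrics([0, 0, 0, 1, 1, 1], [0.1, 0.6, 0.2, 0.8, 0.4, 0.9])
print(example)
```

Reporting both error rates alongside accuracy matters on imbalanced credit data, where a model can reach high accuracy while missing most defaulters.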
Highlights
Assessing the default of customers is an essential basis for personal credit issuance. This paper considers developing a personal credit default discrimination model based on Super Learner heterogeneous ensemble to improve the accuracy and robustness of default discrimination
Compared with the individual base classifiers and with heterogeneous models such as Stacking and Bstacking, the ensembled Super Learner model shows higher discrimination accuracy and robustness
We utilized the heterogeneous ensemble default discriminant model. The ensembled Super Learner model, which determines the optimal combination of multiple base classifiers using cross-validation, performs well in disease prediction in the medical field. This paper considers introducing the Super Learner algorithm into personal credit default evaluation research to build a default discrimination model with heterogeneous ensemble for better default discrimination accuracy and robustness
Summary
6. Aiming at the minimum weighted total loss of all base classifiers, an optimal weighting model of base classifiers is established.
7. Select the weight vector that minimizes the weighted total loss of all base classifiers, and combine it with the base classifiers fitted on the complete dataset to get the Super Learner model.
The emergence of selective ensemble overcomes this shortcoming. It selects the best-performing base classifiers for the ensemble or gives different base classifiers different weights. We can construct an infinite family of weighted candidate combinations and select the optimal weighted combination by minimizing the cross-validated loss. Each algorithm is then refitted on the complete dataset to generate the final base classifiers, which are combined using each classifier's weight to generate the Super Learner ensembled model.
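The weighting and refitting steps described above can be sketched with scikit-learn and SciPy. This is a hedged illustration under stated assumptions, not the authors' implementation. The squared-error loss, the constraint that weights are non-negative and sum to one, the SLSQP solver, the three example base classifiers, and the synthetic data are all assumptions made here, and the names `fit_super_learner` and `predict_default_proba` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.tree import DecisionTreeClassifier


def fit_super_learner(base_learners, X, y, n_splits=10, seed=0):
    """Return (weights, fitted_learners); y must hold 0/1 labels."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Out-of-fold default probabilities, one column per base classifier.
    Z = np.column_stack([
        cross_val_predict(clone(m), X, y, cv=cv, method="predict_proba")[:, 1]
        for m in base_learners
    ])

    def weighted_loss(w):
        # Assumed loss: mean squared difference between labels and the
        # weighted combination of cross-validated predictions.
        return np.mean((y - Z @ w) ** 2)

    k = Z.shape[1]
    result = minimize(
        weighted_loss,
        x0=np.full(k, 1.0 / k),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * k,  # assumed: non-negative weights
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    # Remove tiny numerical violations of the constraints.
    weights = np.clip(result.x, 0.0, None)
    weights = weights / weights.sum()
    # Refit every base classifier on the complete dataset.
    fitted = [clone(m).fit(X, y) for m in base_learners]
    return weights, fitted


def predict_default_proba(weights, fitted, X):
    """Weighted combination of the refitted base classifiers' probabilities."""
    P = np.column_stack([m.predict_proba(X)[:, 1] for m in fitted])
    return P @ weights


# Synthetic stand-in for a credit dataset (illustration only).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)
candidates = [
    LogisticRegression(max_iter=1000),
    DecisionTreeClassifier(max_depth=3, random_state=0),
    RandomForestClassifier(n_estimators=50, random_state=0),
]
weights, fitted = fit_super_learner(candidates, X_demo, y_demo)
proba = predict_default_proba(weights, fitted, X_demo)
print(dict(zip(["logit", "tree", "forest"], np.round(weights, 3))))
```

Estimating the weights from out-of-fold predictions rather than in-sample fits keeps flexible learners such as random forest from being over-weighted simply because they memorize the training data.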