Abstract

Credit scoring in online Peer-to-Peer (P2P) lending faces a huge challenge, which is the credit scoring models discard rejected applicants. This selective discarding leads to bias in the parameters of the models and ultimately affects the performance of credit evaluation. One approach for handling this problem is to adopt reject inference, which is a technique that infer the status of rejected samples and incorporate the results into credit scoring models. The most popular practice of reject inference is to use a credit scoring model that is only built on accepted samples to directly predict the status of rejected samples. However, the distribution of accepted samples in online P2P lending is different from rejected samples. We propose SSL-EC3, a global semi-supervised framework that merges multiple classifiers and clustering algorithms together to make better use of the information of rejected samples. It uses multiple unsupervised models (clustering algorithms) to explore the internal relationships of all samples, and then incorporates the information into the ensemble of supervised models (classifiers) to help correct initial classification results of rejected samples. In addition, we try to use a dynamic ensemble selection (DES) to select the appropriate ensemble of classifiers for each sample to be classified. Experimental results on the real data sets demonstrate the benefits of the proposed methods over conventional methods based on the reject inference.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call