Abstract

Social lending takes place directly between peers, so an investor bears the loss when a borrower fails to repay, which makes accurate prediction of borrower default important. The repayment outcome is known only after the repayment period ends, so labeled data are limited. However, social loans are matched online in real time, and large amounts of unlabeled data are generated. In this paper, we propose a method that combines label propagation and a transductive support vector machine (TSVM) using Dempster–Shafer theory to predict default in social lending accurately by exploiting unlabeled data. To learn effectively from a large amount of data, we build an ensemble of semi-supervised learning methods with different characteristics. Label propagation assigns data with similar features to the same class, while TSVM pushes data with different features apart. Dempster–Shafer fusion enables accurate labeling by exploiting the merits of both methods. Experiments are performed on the open data set from Lending Club. The accuracy of the proposed method is about 10% higher than that of a model trained only on labeled data, and the proposed ensemble method yields more accurate labeling.
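The fusion step can be illustrated with Dempster's rule of combination, which merges two belief assignments over the same set of outcomes and renormalizes away the conflicting mass. The following is a minimal sketch, not the authors' implementation: the abstract does not say how classifier outputs are converted to mass functions, so the values in `m_lp` and `m_tsvm`, the outcome names, and the function name `combine` are all hypothetical.

```python
from itertools import product

# Frame of discernment for one loan (outcome names are assumptions).
DEFAULT = frozenset({"default"})
PAID = frozenset({"paid"})
EITHER = DEFAULT | PAID  # mass on the whole frame means "undecided"


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of outcomes to a mass in [0, 1],
    and the masses of one function sum to 1.
    """
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the two sets.
            fused[inter] = fused.get(inter, 0.0) + wa * wb
        else:
            # Disjoint sets contradict each other; track the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {s: w / (1.0 - conflict) for s, w in fused.items()}


# Hypothetical masses for one unlabeled loan from the two learners.
m_lp = {DEFAULT: 0.6, PAID: 0.1, EITHER: 0.3}    # label propagation
m_tsvm = {DEFAULT: 0.5, PAID: 0.2, EITHER: 0.3}  # TSVM

fused = combine(m_lp, m_tsvm)
# Assign the singleton class with the larger fused mass.
label = max((DEFAULT, PAID), key=lambda s: fused.get(s, 0.0))
```

With these illustrative inputs both learners lean toward default, and the fused mass on default is larger than either learner assigned alone, because agreeing evidence reinforces itself once the conflicting mass is normalized out.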
