Feature Engineering for Credit Risk Evaluation in Online P2P Lending

Shuxia Wang, Zhengshen Jiang, Bin Fu, Zhonghai Wu, Hongzhi Liu, D. Frank Hsu

doi:10.4018/ijssci.2017040101

Abstract

The rise of online P2P lending, a novel economic lending model, brings new opportunities and challenges for research on credit risk evaluation. This paper aims to mine information from different data sources to improve the performance of credit risk evaluation models. Besides the personal financial and demographic data used in traditional models, the authors collect information from (1) text descriptions, (2) social networks, and (3) macroeconomic data. They design methods to extract features from the unstructured data. To avoid the curse of dimensionality caused by too many features and to identify the key factors in credit risk, the authors remove irrelevant and redundant features by feature selection. Using data provided by Prosper.com, one of the biggest P2P lending platforms in the world, they show that: (1) fusing information from different data sources yields better performance, measured by both AUC (area under the receiver operating characteristic curve) and classification accuracy; (2) only ten features from different data sources are required to obtain this better performance.
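
For readers unfamiliar with the two evaluation measures named in the abstract, the following is a minimal sketch of how AUC and classification accuracy can be computed from predicted scores. It is not the authors' code. The function names, the toy labels and scores, the 0.5 decision threshold, and the convention that label 1 marks a defaulted loan are all assumptions made for illustration.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples classified correctly when scores at or above
    the threshold (an assumed cut-off) are predicted as class 1."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 1 = defaulted loan, 0 = repaid loan.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]

print(f"AUC: {auc(labels, scores):.3f}")
print(f"Accuracy: {accuracy(labels, scores):.3f}")
```

AUC depends only on how the scores rank the loans, while accuracy also depends on the chosen threshold, which is one reason to report both measures.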
