Abstract

Assessing the credit risk of applicants on P2P lending platforms is critical to investors. Feature engineering is an essential technique for distilling classification knowledge during the data preprocessing stage of credit risk prediction. Although previous literature has used feature selection methods to identify key features, feature transformation is more useful for discovering intrinsic nonlinear characteristics in credit data. In this study, we propose a synthetic multiple tree‐based feature transformation method to generate features: multiple tree‐based feature transformation methods are employed and their outputs fused to form a new feature set. Specifically, we propose two types of feature transformation methods and validate their effect: the bagging‐based tree ensemble feature transformation method (Bagging‐TreeEnsembleFT) and the boosting‐based tree ensemble feature transformation method (Boosting‐TreeEnsembleFT). We evaluate the credit risk prediction performance of the proposed synthetic feature transformation methods on real P2P lending credit datasets. Empirical analysis demonstrates that, across datasets with different partitions and class distributions, tree‐based ensemble feature transformation methods with a boosting ensemble strategy achieve better prediction performance than both those with a bagging ensemble strategy and individual tree‐based feature transformation methods. Moreover, the proposed synthetic feature transformation method improves credit risk prediction performance in terms of accuracy, AUC, and F1‐score.
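
The abstract does not spell out how a tree ensemble produces new features. One common approach, assumed here purely for illustration and not confirmed as the authors' method, is leaf-index encoding: each sample is routed down every tree, the leaf it reaches is one-hot encoded, and the encodings from the bagging and boosting ensembles are concatenated into the fused feature set. The toy trees, thresholds, and the two input features (loan amount, annual income) below are hypothetical; in practice the trees would come from fitted ensembles.

```python
# Minimal sketch of tree-based feature transformation via leaf-index encoding.
# ASSUMPTION: leaf-index one-hot encoding is a common technique; the abstract
# does not confirm that the paper uses exactly this scheme.
#
# Tree representation (hypothetical, hand-written for illustration):
#   internal node: (feature_index, threshold, left_subtree, right_subtree)
#   leaf:          an int leaf id, numbered 0..n_leaves-1 within its tree


def leaf_index(tree, x):
    """Route sample x down the tree and return the id of the leaf it reaches."""
    while not isinstance(tree, int):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree


def count_leaves(tree):
    """Count the leaves of a tree (the width of its one-hot encoding)."""
    if isinstance(tree, int):
        return 1
    _, _, left, right = tree
    return count_leaves(left) + count_leaves(right)


def ensemble_transform(trees, x):
    """One-hot encode the leaf reached in each tree and concatenate the results."""
    features = []
    for tree in trees:
        one_hot = [0] * count_leaves(tree)
        one_hot[leaf_index(tree, x)] = 1
        features.extend(one_hot)
    return features


def fused_transform(bagging_trees, boosting_trees, x):
    """Fuse the transformed features of a bagging and a boosting ensemble."""
    return ensemble_transform(bagging_trees, x) + ensemble_transform(boosting_trees, x)


# Hypothetical input features: x = [loan_amount, annual_income]
bagging_trees = [
    (0, 10000.0, 0, 1),
    (1, 50000.0, 0, (0, 20000.0, 1, 2)),
]
boosting_trees = [
    (1, 30000.0, 0, 1),
]

applicant = [15000.0, 60000.0]
new_features = fused_transform(bagging_trees, boosting_trees, applicant)
print(new_features)
```

The resulting binary vector would then be passed to a downstream classifier, whose accuracy, AUC, and F1‐score can be compared against the same classifier trained on the raw features.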

