Abstract

Credit scoring is an important tool for banks and lending companies to manage credit risk exposure and gain profits. Gradient boosting decision trees (GBDTs), a group of boosting-type ensemble algorithms, have shown promising improvements for credit scoring. However, GBDT improves credit scoring performance by iteratively modifying only the fitting target of each base classifier while always working on the same features, which limits the diversity of the individual classifiers in GBDT. Moreover, the performance-interpretability dilemma has motivated a large number of works to pursue high-performance ensemble strategies, leaving the interpretability of credit scoring models underexplored. To address these limitations, two tree-based augmented GBDTs (AugBoost-RFS and AugBoost-RFU) are proposed in this work for credit scoring. In the proposed methods, a step-wise feature augmentation mechanism is introduced for GBDT to enrich the diversity of the individual base classifiers, and tree-based embedding techniques simplify the feature augmentation process while inheriting the interpretability of GBDT. Results on four large-scale credit scoring datasets show that AugBoost-RFS and AugBoost-RFU outperform GBDT. In addition, supervised tree-based step-wise feature augmentation for GBDT achieves results comparable to neural network-based step-wise feature augmentation while significantly improving augmentation efficiency. Finally, the intrinsic global interpretation results and decision rules of the tree-enhanced GBDTs, together with the marginal contributions of features visualized by TreeSHAP, demonstrate that AugBoost-RFS and AugBoost-RFU can be good candidates for interpretable credit scoring.
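
The step-wise feature augmentation idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the class name `AugmentedGBDTSketch`, all hyperparameters, and the synthetic data are hypothetical. It assumes that the supervised variant can be approximated by one-hot encoded leaf indices of a small random forest fitted to the current residuals, and the unsupervised variant by scikit-learn's `RandomTreesEmbedding`. It also uses a plain gradient step on the log-loss rather than a full GBDT leaf update.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestRegressor, RandomTreesEmbedding
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor


def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class AugmentedGBDTSketch:
    """Hypothetical sketch: boosting where every stage sees freshly augmented features."""

    def __init__(self, n_stages=10, learning_rate=0.3, supervised=True, random_state=0):
        self.n_stages = n_stages
        self.learning_rate = learning_rate
        self.supervised = supervised
        self.random_state = random_state

    def _fit_embedder(self, X, residual, seed):
        # Assumption: a tree-based embedding is the one-hot code of the leaf each sample falls into.
        if self.supervised:
            forest = RandomForestRegressor(
                n_estimators=5, max_depth=3, random_state=seed
            ).fit(X, residual)
            encoder = OneHotEncoder(handle_unknown="ignore").fit(forest.apply(X))
            return lambda A: encoder.transform(forest.apply(A)).toarray()
        embedding = RandomTreesEmbedding(
            n_estimators=5, max_depth=3, random_state=seed
        ).fit(X)
        return lambda A: embedding.transform(A).toarray()

    def fit(self, X, y):
        # Start from the log-odds of the base rate.
        p = np.clip(y.mean(), 1e-6, 1 - 1e-6)
        self.init_ = np.log(p / (1 - p))
        F = np.full(len(y), self.init_)
        self.stages_ = []
        for m in range(self.n_stages):
            # Negative gradient of the log-loss with respect to the raw score.
            residual = y - _sigmoid(F)
            # A new embedding per stage, so base learners do not all see the same features.
            embed = self._fit_embedder(X, residual, self.random_state + m)
            X_aug = np.hstack([X, embed(X)])
            tree = DecisionTreeRegressor(
                max_depth=3, random_state=self.random_state + m
            ).fit(X_aug, residual)
            F += self.learning_rate * tree.predict(X_aug)
            self.stages_.append((embed, tree))
        return self

    def predict_proba(self, X):
        F = np.full(X.shape[0], self.init_)
        for embed, tree in self.stages_:
            F += self.learning_rate * tree.predict(np.hstack([X, embed(X)]))
        return _sigmoid(F)


# Synthetic stand-in data; the paper's credit scoring datasets are not reproduced here.
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)
for use_supervised in (True, False):
    model = AugmentedGBDTSketch(supervised=use_supervised).fit(X_demo, y_demo)
    accuracy = ((model.predict_proba(X_demo) > 0.5) == y_demo).mean()
    print(f"supervised={use_supervised}: training accuracy {accuracy:.3f}")
```

Because every stage keeps the original features next to the embedding, each base tree still splits on named inputs or on leaf-membership indicators, which is one plausible reading of how a tree-based augmentation could preserve rule-style interpretability.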
