Abstract

Credit scoring has been transformed by the development of the financial system and has received increasing attention from academia and industry. Artificial intelligence technology has reshaped credit scoring through predictive classification. In this study, a new hybrid ensemble model with voting-based outlier detection and balanced sampling is proposed to achieve superior predictive power for credit scoring. To prevent noisy data from misleading classifier training, a new voting-based outlier detection method is proposed that enhances classic outlier detection algorithms with a weighted voting mechanism and incorporates the resulting outlier scores into the training set to form an outlier-adapted training set. To reduce the information loss caused by under-sampling when dealing with imbalanced data, a new bagging-based balanced sampling method is proposed that enhances traditional under-sampling methods with a bagging strategy to obtain a balanced training set. To further improve the performance of the proposed model, a stacking-based ensemble modeling method is proposed that first performs parameter optimization and then constructs a stacking-based multi-stage ensemble model. Five datasets from the UC Irvine Machine Learning Repository and five evaluation indicators were adopted to evaluate model performance. The experimental results indicate the superior performance of the proposed model and demonstrate its robustness and effectiveness.
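
The first two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the min-max normalization, the weighted-average voting rule, and the choice to keep every minority sample in each bag are all assumptions made for illustration, and the stacking stage is omitted.

```python
import random
from typing import List, Sequence


def min_max(scores: Sequence[float]) -> List[float]:
    """Rescale one detector's raw scores to [0, 1] so detectors are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def weighted_vote(detector_scores: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Combine per-detector outlier scores with a weighted average (assumed rule)."""
    normalized = [min_max(s) for s in detector_scores]
    total = sum(weights)
    n = len(normalized[0])
    return [sum(w * s[i] for w, s in zip(weights, normalized)) / total
            for i in range(n)]


def append_score(rows: Sequence[Sequence[float]],
                 scores: Sequence[float]) -> List[List[float]]:
    """Add the combined outlier score as an extra feature column (assumed reading
    of the 'outlier-adapted training set')."""
    return [list(row) + [score] for row, score in zip(rows, scores)]


def balanced_bags(labels: Sequence[int], n_bags: int, minority: int = 1,
                  seed: int = 0) -> List[List[int]]:
    """Build index lists for several balanced subsets: each keeps all minority
    rows plus an equal-size random draw (without replacement) of majority rows."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    k = min(len(minority_idx), len(majority_idx))
    return [minority_idx + rng.sample(majority_idx, k) for _ in range(n_bags)]


# Hypothetical scores from two detectors for four samples.
combined = weighted_vote([[0.1, 0.2, 0.9, 0.15], [1.0, 2.0, 9.0, 1.5]],
                         weights=[0.5, 0.5])
adapted = append_score([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]], combined)
bags = balanced_bags([1, 0, 0, 0, 0, 1, 0, 0], n_bags=3)
```

Because each bag draws a different subset of the majority class, training one base learner per bag uses more of the majority data than a single under-sampled set would, which is the information-loss argument made in the abstract.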
