Features selection approaches combined with effective classifiers in credit scoring

Chia-Ching Lin, Feng-Chia Li, Tzu-Chin Chao, Chin-Chih Chang

doi:10.1109/ieem.2011.6118017

https://doi.org/10.1109/ieem.2011.6118017

Publication Date: Dec 1, 2011

Affiliation: Yu Da University

Abstract

With the rapid growth of the credit industry, credit scoring models are widely used for credit admission evaluation. Credit scoring is regarded as a critical topic, and the departments involved collect large amounts of data to avoid making wrong decisions. Finding effective classification models is important because they help managers make objective decisions rather than relying merely on intuition and experience. This study proposes three feature selection approaches, each combined with two well-known classifiers, K-Nearest Neighbor (KNN) and Support Vector Machine (SVM), to find the best hybrid classifier combination. Feature selection retains sufficient information for classification purposes. Different credit scoring combinations are constructed by pairing each of the three feature selection approaches with each of the two classifiers. Two credit data sets from the University of California, Irvine (UCI) are used to evaluate the accuracy of the hybrid feature selection models. The KNN and SVM classifiers are combined with linear discriminant analysis (LDA), rough set theory (RST), and F-score approaches as a feature preprocessing step that optimizes the feature space by removing both irrelevant and redundant features. The paper describes the procedures of the proposed approaches and evaluates their performance. The results are compared, and a nonparametric test is performed to determine whether the models differ significantly. The F-score approach combined with the classifiers performs very well on both data sets. The results suggest that the hybrid credit scoring approach is mostly robust and effective in finding optimal feature subsets and is a promising method in the field of data mining.
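
To make the hybrid idea concrete, the following is a minimal sketch of one such combination: F-score feature ranking followed by a KNN classifier. It is not the authors' implementation. The abstract does not define the F-score, so the sketch assumes the commonly used Fisher-style definition (separation of the class means from the overall mean, divided by the sum of the within-class sample variances). The function names `f_score`, `select_features`, and `knn_predict`, the choice of `k`, and the toy data are all hypothetical and are not taken from the UCI data sets.

```python
from statistics import mean, variance


def f_score(column, labels):
    """F-score of one feature for binary labels (1 = positive, 0 = negative).

    Assumed definition: between-class separation over within-class variance.
    """
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    overall = mean(column)
    # Squared distance of each class mean from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Sum of the two sample variances (n - 1 denominator).
    within = variance(pos) + variance(neg)
    return between / within if within > 0 else 0.0


def select_features(rows, labels, k):
    """Return the (sorted) indices of the k features with the highest F-score."""
    n_features = len(rows[0])
    scores = [f_score([r[j] for r in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def knn_predict(train_rows, train_labels, query, n_neighbors=3):
    """Majority vote among the nearest training rows (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = [label for _, label in dists[:n_neighbors]]
    return max(set(votes), key=votes.count)


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
rows = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 4.5], [3.1, 3.2], [2.9, 5.1]]
labels = [0, 0, 0, 1, 1, 1]

kept = select_features(rows, labels, k=1)          # feature selection step
reduced = [[r[j] for j in kept] for r in rows]     # reduced feature space
prediction = knn_predict(reduced, labels, [3.05])  # classification step
```

Swapping `knn_predict` for an SVM, or `select_features` for an LDA- or RST-based selector, would give the other combinations the abstract mentions; those are left out here to keep the sketch short.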