Credit scoring using data mining techniques with particular reference to Sudanese banks

Eiman Kambal, Izzeldin Osman, Sara Mohammed, Methag Taha, Noon Mohammed

doi:10.1109/icceee.2013.6633966

Abstract

One of the key success factors for lending organizations in general, and banks in particular, is assessing borrower creditworthiness in advance, during the credit evaluation process. Many researchers have applied credit scoring models to improve this assessment by differentiating between prospective loans on the basis of the likelihood of repayment. Credit scoring is therefore a typical Data Mining (DM) classification problem, and many traditional statistical and modern computational intelligence techniques have been presented in the literature to tackle it. The main objective of this paper is to describe an experiment in building suitable Credit Scoring Models (CSMs) for Sudanese banks. Two commonly discussed data mining classification techniques are chosen: Decision Trees (DT) and Artificial Neural Networks (ANN). Genetic Algorithms (GA) and Principal Component Analysis (PCA) are also applied as feature selection techniques. In addition to a Sudanese credit dataset, the German credit dataset is used to evaluate these techniques. The results reveal that ANN models outperform DT models in most cases, and that GA is more effective than PCA as a feature selection technique. A hybrid GA-ANN model achieves the highest accuracy on both the German dataset (80.67%) and the Sudanese dataset (69.74%). Although DT and its hybrid models (PCA-DT, GA-DT) are outperformed by ANN and its hybrid models (PCA-ANN, GA-ANN) in most cases, they produce interpretable loan-granting decisions.
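To illustrate how a GA can act as a feature selector wrapped around a classifier, as in the hybrid GA-DT and GA-ANN models named above, here is a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not state the GA operators, population size, or fitness function used, so this sketch uses truncation selection, one-point crossover, and bit-flip mutation. A simple nearest-centroid classifier stands in for the DT and ANN models, and the data are synthetic, not the German or Sudanese credit datasets. The function names (make_data, centroid_accuracy, ga_select) are hypothetical.

```python
import random


def make_data(n=200, n_features=8, n_informative=3, seed=0):
    """Synthetic two-class data (assumption): only the first
    n_informative features differ between the classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        row = [rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(mask, X_tr, y_tr, X_va, y_va):
    """Fitness: validation accuracy of a nearest-centroid classifier
    restricted to the features where mask[j] == 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_va, y_va):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += (min(dist, key=dist.get) == t)
    return correct / len(y_va)


def ga_select(X_tr, y_tr, X_va, y_va, pop_size=20, generations=15,
              p_mut=0.1, seed=1):
    """Search over binary feature masks with a simple elitist GA."""
    rng = random.Random(seed)
    n = len(X_tr[0])

    def fitness(mask):
        return centroid_accuracy(mask, X_tr, y_tr, X_va, y_va)

    # Random initial population, plus the all-features mask as a baseline.
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    pop.append([1] * n)
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection, parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation with probability p_mut per gene
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    X, y = make_data()
    mask, acc = ga_select(X[:150], y[:150], X[150:], y[150:])
    print("selected feature mask:", mask)
    print("validation accuracy:", acc)
```

Because the fitness is measured on the same validation split that guides the search, the reported accuracy is optimistic; a real experiment would hold out a separate test set or use cross-validation. Swapping the fitness function for the validation accuracy of a decision tree or neural network would give the GA-DT and GA-ANN style of hybrid the abstract describes.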
