A Credit Scoring Heterogeneous Ensemble Model Using Stacking and Voting

C J Anil Kumar, B K Raghavendra, S Raghavendra

doi:10.17485/ijst/v15i7.1715

Abstract

Background/Objectives: Recent studies emphasize the use of ensemble models over single classifiers for credit scoring problems. The objective of this study is to build a heterogeneous ensemble classifier with improved classification accuracy.

Methods: The study develops a heterogeneous ensemble classifier using Logistic Regression (LR), K-Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), Naïve Bayes (NB) and Support Vector Machine (SVM) as base classifiers, and RF, LR and SVM as meta-classifiers. The proposed model uses these six base classifiers for ensemble aggregation. A feature selection algorithm based on the random forest technique selects the best features. Stacking and voting are used to build the ensemble model.

Findings: The ensemble classifier gives better predictive performance than the single classifiers SVM, DT, RF, NB, KNN and LR, with an accuracy of 91.56% on the Australian dataset and 84.35% on the German dataset.

Novelty: The proposed model combines stacking and majority voting for ensemble classification. Stacking is first applied to the base classifiers, in two levels. In the first level, the training dataset is split into 10 folds for cross-validation, the output of each base classifier is collected, and the dataset is updated with these meta-features. In the second level, three meta-classifiers (MC), namely LR, SVM and RF, are trained. Majority voting is then applied to the outputs of these meta-classifiers to produce the final prediction.

Keywords: Credit scoring; ensemble model; SVM; DT; RF; NB; KNN; LR
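The two-level procedure described in the abstract can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`k_fold_indices`, `out_of_fold_meta_features`, `majority_vote`, `fit_stacked_voter`) are hypothetical, a toy one-feature threshold learner stands in for the six base classifiers and the three meta-classifiers, the toy data is synthetic, and the meta-classifiers are assumed to train on the meta-features alone. The random-forest feature selection step is omitted.

```python
from collections import Counter
from typing import Callable, List, Sequence

Row = Sequence[float]
Predictor = Callable[[Row], int]
Trainer = Callable[[List[Row], List[int]], Predictor]


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k contiguous, nearly equal folds."""
    return [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]


def out_of_fold_meta_features(X, y, trainers, k=10):
    """Level 1: each base learner predicts a row only while that row is held out."""
    meta = [[0] * len(trainers) for _ in X]
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        X_tr = [x for i, x in enumerate(X) if i not in held_out]
        y_tr = [t for i, t in enumerate(y) if i not in held_out]
        for j, train in enumerate(trainers):
            predict = train(X_tr, y_tr)
            for i in fold:
                meta[i][j] = predict(X[i])
    return meta


def majority_vote(votes: Sequence[int]) -> int:
    """Return the most common label among the votes."""
    return Counter(votes).most_common(1)[0][0]


def fit_stacked_voter(X, y, base_trainers, meta_trainers, k=10) -> Predictor:
    """Level 2: train meta-learners on the meta-features, then vote on their outputs."""
    meta_X = out_of_fold_meta_features(X, y, base_trainers, k)
    meta_models = [train(meta_X, y) for train in meta_trainers]
    # Refit the base learners on all rows so new applicants can be scored.
    base_models = [train(X, y) for train in base_trainers]

    def predict(x: Row) -> int:
        meta_row = [model(x) for model in base_models]
        return majority_vote([model(meta_row) for model in meta_models])

    return predict


def stump_trainer(feature: int) -> Trainer:
    """Toy stand-in learner: threshold one feature midway between the class means."""
    def train(X, y):
        pos = [x[feature] for x, t in zip(X, y) if t == 1]
        neg = [x[feature] for x, t in zip(X, y) if t == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        cut = (mean_pos + mean_neg) / 2
        sign = 1 if mean_pos >= mean_neg else -1
        return lambda x: 1 if sign * (x[feature] - cut) > 0 else 0
    return train


# Synthetic toy data (an assumption for illustration): 20 applicants, 3 features.
y = [i % 2 for i in range(20)]
X = [[t + 0.1 * (i % 3), 5 - 2 * t + 0.1 * (i % 4), 0.5 * t + 0.01 * i]
     for i, t in enumerate(y)]

base_trainers = [stump_trainer(0), stump_trainer(1), stump_trainer(2)]
meta_trainers = [stump_trainer(0), stump_trainer(1), stump_trainer(2)]
model = fit_stacked_voter(X, y, base_trainers, meta_trainers, k=10)
print([model(x) for x in X])
```

Holding each row out while its meta-feature is produced keeps the meta-classifiers from learning on predictions that the base classifiers made for rows they had already seen. In practice the same structure could probably be assembled from scikit-learn's `StackingClassifier` and `VotingClassifier` rather than written by hand.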

Highlights

A credit scoring model is an analysis tool used to determine the creditworthiness of a loan applicant based on historical data and by estimating the default probability
The models are designed by training single base classifiers and the resulting output is integrated by using an ensemble strategy to enhance the performance
Support Vector Machine (SVM), Logistic Regression (LR), K-Nearest Neighbor (KNN), Random Forest (RF), Naive Bayes (NB) and Decision Tree (DT) are used as base models

Summary

Introduction

A credit scoring model is an analysis tool used to determine the creditworthiness of a loan applicant based on historical data and by estimating the default probability. Ensemble modeling has been shown to improve the performance of credit scoring models. The credit scoring model is used to assess the credit risk of a new applicant (2) or to assess the likelihood of a default using information from a previous loan applicant (3). The two most widely used statistical methods in credit scoring are Logistic Regression (LR) and Linear Discriminant Analysis (LDA). Machine learning classification approaches such as K-Nearest Neighbor (KNN), Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB), Classification and Regression Tree (CART), Genetic Algorithms (GA), and Artificial Neural Networks (ANN) are extensively used in credit scoring.



Journal: Indian Journal of Science and Technology	Publication Date: Feb 21, 2021
Citations: 1	License type: cc-by


Similar Papers

A systematic credit scoring model based on heterogeneous classifier ensembles
Maher Ala'Raj ... Maysam Abbod
01 Sep 2015

Performance Evaluation of Homogeneous and Heterogeneous Ensemble Models for Groundwater Salinity Predictions: a Regional-Scale Comparison Study
Alvin Lal ... Bithin Datta
Water, Air, & Soil Pollution | VOL. 231
01 Jun 2020

A novel heterogeneous ensemble credit scoring model based on bstacking approach
Yufei Xia ... Fangming Xie
Expert Systems with Applications | VOL. 93
10 Oct 2017

Data based prediction of cancer diagnoses using heterogeneous model ensembles
Stephan M Winkler ... Herbert Stekel
12 Jul 2014
