Abstract
With the development of financial consumption, demand for credit has soared. Since banks hold detailed client data, it is important to build effective models that distinguish high-risk groups from low-risk groups. However, traditional credit evaluation methods, including expert opinion, credit rating, and credit scoring, are highly subjective and inaccurate. Moreover, the data are highly unbalanced, since high-risk individuals are far fewer than low-risk individuals. Progress in machine learning makes accurate credit analysis possible. Tree-based machine learning models are particularly suitable for unbalanced credit data because they can weight the individual samples. We apply a series of tree-based machine learning models to analyze the German Credit Data from the UCI Repository of Machine Learning Databases.
Highlights
With the development of economic and information globalization, large amounts of business data and customer information are collected and stored in banking systems
The rating method quantifies a customer's credit rating after a comprehensive analysis of customer information
The credit scoring method is subjective, since the risk factors and the weights of these factors are set subjectively
Summary
With the development of economic and information globalization, large amounts of business data and customer information are collected and stored in banking systems. Machine learning models are more comprehensive, scientific, objective, and fair, and can significantly improve the accuracy of credit evaluation. Credit evaluation analysis can be understood as a representative classification problem in machine learning, since our goal is to distinguish between high-risk and low-risk groups. The imbalance of the data presents a great challenge in credit evaluation analysis, because the important high-risk individuals are fewer and the corresponding information is scarcer. We employ several tree-based machine learning models to conduct credit evaluation, because these models can handle the imbalance of credit data by weighting the individual samples. We briefly summarize our results and discuss the advantages and disadvantages of our methods.
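The idea of weighting individual samples can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy labels, and the choice of the common "balanced" heuristic (class weight = n / (k * n_c), for n samples, k classes, and n_c samples in class c) are all assumptions made for illustration. The sketch shows how such weights change the Gini impurity that a decision tree would evaluate at a node.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Hypothetical helper: weight each class inversely to its frequency,
    using the common 'balanced' heuristic n / (k * n_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_gini(labels, weights):
    """Gini impurity of a node, where each sample contributes its class weight
    instead of a count of 1."""
    mass = Counter()
    for y in labels:
        mass[y] += weights[y]
    total = sum(mass.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((m / total) ** 2 for m in mass.values())


# Toy labels (illustrative only, not the German Credit Data):
# 7 low-risk and 3 high-risk individuals in one tree node.
labels = ["low"] * 7 + ["high"] * 3

weights = balanced_class_weights(labels)
unweighted = {"low": 1.0, "high": 1.0}

print("class weights:", weights)
print("unweighted Gini:", weighted_gini(labels, unweighted))
print("weighted Gini:", weighted_gini(labels, weights))
```

In this toy node the unweighted impurity is about 0.42, while the weighted impurity is about 0.5, so the minority high-risk class counts as much as the majority class when a split is scored. Other weighting schemes are possible, and the right one depends on the relative cost of misclassifying each group.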