Abstract

Credit scores are the basis on which financial institutions make credit decisions. As big data technology has spread into the financial sector, personal credit investigation has entered a new era, and personal credit evaluation based on big data has become an active research topic. This paper makes three contributions. First, for the application scenario of evaluating personal credit data, the experimental dataset is cleaned, the discrete variables are one-hot encoded, and the data are standardized. Because personal credit data are high-dimensional, the pdC-RF algorithm is used to optimize the correlation among data features, reducing the data from 145 dimensions to 22. Second, the dataset is WOE (weight of evidence) encoded and used to train random forest, support vector machine, and logistic regression models, whose performance is compared; logistic regression proves the most suitable for a personal credit evaluation model based on the Lending Club dataset. Finally, using the logistic regression model with the best parameters, the user samples are graded and the final scorecard is output.
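To make the WOE encoding step concrete, the following is a minimal sketch rather than the paper's implementation. It assumes the common convention WOE = ln(share of good samples in a bin / share of bad samples in a bin), with label 1 meaning default. The function name `woe_table` and the additive `smoothing` constant (used only to avoid division by zero in empty cells) are illustrative assumptions.

```python
import math
from collections import defaultdict


def woe_table(bins, labels, smoothing=0.5):
    """Map each bin to its weight of evidence (WOE).

    Assumed convention: WOE = ln(good_share / bad_share), label 1 = default.
    `smoothing` is a hypothetical additive constant, not taken from the paper.
    """
    good = defaultdict(float)
    bad = defaultdict(float)
    for b, y in zip(bins, labels):
        if y == 1:
            bad[b] += 1
        else:
            good[b] += 1

    categories = set(good) | set(bad)
    # Smoothed totals so that the smoothed shares still sum to 1.
    total_good = sum(good.values()) + smoothing * len(categories)
    total_bad = sum(bad.values()) + smoothing * len(categories)

    table = {}
    for c in categories:
        good_share = (good[c] + smoothing) / total_good
        bad_share = (bad[c] + smoothing) / total_bad
        table[c] = math.log(good_share / bad_share)
    return table


# Toy data: bin "A" holds mostly non-defaulters, bin "B" mostly defaulters.
bins = ["A", "A", "A", "B", "B", "B"]
labels = [0, 0, 1, 1, 1, 0]
table = woe_table(bins, labels)
```

Under this convention a positive WOE marks a bin with relatively more good samples, so in the toy data bin "A" receives a positive value and bin "B" a negative one. Each raw feature value would then be replaced by the WOE of its bin before the models are fitted.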

Highlights

  • With the rapid development of big data technology, the credit information system has entered a new era

  • Big data play an important role in predicting and evaluating economic credit. Through tests at lending institutions, it has been shown [2] that a special test within lending institutions is 18.4% lower than the big data credit score forecast for evaluating whether individuals will default on loans, and that big data credit scoring has great advantages in predicting outcomes for people who have never had a borrowing record and in correcting financial reporting errors. The importance of credit evaluation scores is becoming more and more prominent, and every institution needs a high-performance credit scoring model to avoid risk [3]

  • Building on the data preprocessing and dimensionality reduction described in the previous section, the personal credit evaluation model in this paper starts with data partition, then bins the data, which completes the discretization of the data. Then, using machine learning algorithms, the samples are trained, and the performance of each algorithm is compared to obtain the best-performing model
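The partition and binning steps described in the last highlight can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the 70/30 split ratio, the equal-frequency binning rule, and the names `split_rows`, `quantile_edges`, and `assign_bin` are all hypothetical.

```python
import bisect
import random


def split_rows(rows, test_ratio=0.3, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_ratio)))
    return shuffled[:cut], shuffled[cut:]


def quantile_edges(values, n_bins):
    """Interior cut points for roughly equal-frequency bins."""
    ordered = sorted(values)
    edges = []
    for k in range(1, n_bins):
        edge = ordered[(k * len(ordered)) // n_bins]
        # Skip duplicate cut points caused by repeated values.
        if not edges or edge > edges[-1]:
            edges.append(edge)
    return edges


def assign_bin(value, edges):
    """Bin index of `value`; bin i covers [edges[i-1], edges[i])."""
    return bisect.bisect_right(edges, value)


# Toy continuous feature with values 1..12, discretized into 3 bins.
values = list(range(1, 13))
edges = quantile_edges(values, n_bins=3)
binned = [assign_bin(v, edges) for v in values]

train, test = split_rows(list(range(10)))
```

In practice the cut points would be computed on the training partition only and then reused for the test partition, so that no information from the test samples leaks into the discretization.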


Summary

Introduction

With the rapid development of big data technology, the credit information system has entered a new era. Beyond technical research on credit scoring methods, alternative data sources have been adopted to improve the performance of statistical and economic models. In this line of work, call-network data serve as the big data source, feature selection increases the profit value, and the dataset is used to create scores for users applying for credit cards. In this paper, min-max standardization is applied to the continuous variables and one-hot encoding to the discrete variables, which completes the first step of data preprocessing. The data are then reduced from 74 dimensions to 21, which completes the second step of data preprocessing.
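The first preprocessing step, min-max standardization of continuous variables and one-hot encoding of discrete variables, can be sketched as below. This is a minimal illustration with hypothetical helper names (`min_max_scale`, `one_hot`) and made-up sample values; it is not the paper's code.

```python
def min_max_scale(values):
    """Rescale a continuous column to [0, 1] via (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a discrete column as 0/1 indicator rows.

    Returns (categories, rows), where rows[i][j] == 1 exactly when
    values[i] == categories[j].
    """
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


# Made-up sample columns for illustration only.
scaled = min_max_scale([10, 20, 30])
categories, encoded = one_hot(["RENT", "OWN", "RENT"])
```

Because one-hot encoding creates one column per category, it increases the dimensionality of the data, which is one reason a dimensionality reduction step follows the preprocessing.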

Construction of Personal Credit Rating Evaluation Model
Data Partition
Findings
Conclusion