Abstract
We propose a personal credit evaluation algorithm that combines a decision tree with a boosting algorithm and use it for classification. Compared with the conventional decision tree algorithm, the boosting algorithm is shown to speed up processing. The Classification and Regression Tree (CART) algorithm reached 90.95% accuracy with boosting, slightly higher than the 90.31% obtained without it. To avoid overfitting on the training set caused by an unreasonable division of the data set, we use cross-validation and illustrate the results by simulation; the hyperparameters of the model are applied and the model fit is verified. The proposed decision tree model is fitted optimally with the help of a confusion matrix. Relevant evaluation indicators are also introduced to assess the performance of the proposed model. For comparison with conventional methods, we report the accuracy rate, error rate, precision, recall, and related measures, and we evaluate the model comprehensively on its accuracy after 10-fold cross-validation. The results show that, when CART is applied, the boosting algorithm improves the accuracy and precision of the model, but model fitting takes much longer, around 2 min. These results verify that the performance of the decision tree model improves under the boosting algorithm. We also test the performance of the proposed model by model fitting; it could be applied to predicting customers' decisions on subscribing to the fixed deposit business.
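The evaluation indicators named in the abstract (accuracy rate, error rate, precision, recall) and the 10-fold split can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `confusion_counts`, `evaluate`, and `k_fold_indices` are hypothetical, labels are assumed to be binary (1 = positive, 0 = negative), and the folds are assumed to be contiguous and unshuffled.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count the four confusion-matrix cells for binary labels (1 = positive)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    return counts


def evaluate(counts: Dict[str, int]) -> Dict[str, float]:
    """Derive accuracy, error rate, precision, and recall from the counts."""
    tp, fp, tn, fn = counts["tp"], counts["fp"], counts["tn"], counts["fn"]
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        # Guard against division by zero when a class is never predicted/present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Returns one (train_indices, test_indices) pair per fold.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
    print(evaluate(confusion_counts(y_true, y_pred)))
    print(len(k_fold_indices(25, k=10)))
```

In a 10-fold evaluation, a model would be fitted on each `train` index list and scored on the matching `test` list, and the per-fold metrics would then be averaged.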
Highlights
The decision tree, a method for approximating classification functions, originated in the field of machine learning [1].
The decision tree algorithm gradually developed into a family of algorithms, including the Iterative Dichotomiser 3 (ID3), C4.5, C5.0, and Classification and Regression Tree (CART) algorithms [6].
This paper uses the C5.0 and CART algorithms, both of which evolved from earlier algorithms and offer improved overall performance [6].
Summary
The decision tree, a method for approximating classification functions, originated in the field of machine learning [1]. The concept learning system proposed by Hunt et al. is the earliest decision tree algorithm [5]. The decision tree algorithm gradually developed into a family of algorithms, including the Iterative Dichotomiser 3 (ID3), C4.5, C5.0, and Classification and Regression Tree (CART) algorithms [6]. The C5.0 algorithm is an intuitive and efficient classification method, but its information gain ratio is complex to compute, and it is prone to overfitting and decision tree bias. To address these problems, the calculation of the information gain ratio is simplified by formula transformation. A classifier ensemble was proposed to enhance diversity, and it provided a near-optimal classifying system [8,9].
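For reference, the information gain ratio mentioned above can be sketched in its standard textbook form: the information gain of a split divided by the entropy of the split itself. The simplified formula transformation that the summary refers to is not given in this text, so the code below shows only the unsimplified definition, and the function names `entropy` and `gain_ratio` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Textbook gain ratio: information gain divided by split information."""
    total = len(labels)
    # Group the class labels by the value the feature takes on each sample.
    groups: Dict[Hashable, List[Hashable]] = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature values themselves.
    split_info = entropy(feature)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    # Made-up toy data: the feature separates the two classes perfectly.
    print(gain_ratio(["a", "a", "b", "b"], [1, 1, 0, 0]))
```

Dividing by the split information penalizes features with many distinct values, which plain information gain tends to favor.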