This paper proposes an incremental random forest algorithm for classification and prediction on dynamically growing data. Traditional batch machine learning algorithms build a model once and cannot incorporate newly generated samples into learning, which leads to excessive model deviation. We combine incremental learning with random forest to address this limitation. Applied to predicting credit card customer default, the algorithm can help banks control risk and reduce losses. Such prediction is important for auditing card issuance and for early warning of cardholder risk. In an experiment on credit card holder data from a bank in Taiwan, the incremental random forest performed relatively better at predicting customer default than random forest, decision tree, logistic regression, naive Bayes, BP neural network, and support vector machine.