Abstract

Random forest (RF) is an ensemble classification method in which all decision trees participate in voting, so low-quality decision trees reduce the accuracy of the forest. To improve accuracy, only decision trees with a larger degree of diversity and higher classification accuracy should be selected for voting. This paper proposes an RF based on the Kappa measure and an improved binary artificial bee colony algorithm (IBABC). First, the Kappa measure is used for pre-pruning, selecting the decision trees with a larger degree of diversity from the forest. Then, a crossover operator and a leaping operator are introduced into ABC, and the resulting improved binary ABC is used for secondary pruning, selecting the decision trees with better performance for voting. The proposed method (Kappa+IBABC) is tested on a number of UCI datasets. Computational results demonstrate that Kappa+IBABC improves performance on most datasets while using fewer decision trees. The Wilcoxon signed-rank test is used to verify the significance of the difference between Kappa+IBABC and other pruning methods. In addition, as haze pollution in China becomes increasingly serious, the proposed method is applied to haze weather prediction and achieves good results.

Highlights

  • Random forest (RF) is an ensemble classifier method proposed by Breiman [1]

  • The diversity and average precision of the decision trees in a random forest are two important factors in improving the performance of the forest

  • The pseudo-code of the random forest based on the Kappa measure and the improved binary artificial bee colony algorithm (IBABC) is given as Algorithm 4 (Figure 6)


Summary

INTRODUCTION

RF is an ensemble classification method proposed by Breiman [1]. A random forest comprises tree classifiers, and its meta-classifiers are decision trees constructed as classification and regression trees (CART). Among the contributions of the paper, a crossover operator and a leaping operator are introduced into ABC, and the improved binary ABC is used for secondary pruning, so that the decision trees with better performance are selected for voting.

RANDOM FOREST BASED ON KAPPA PRUNING AND THE IMPROVED BINARY ARTIFICIAL BEE COLONY ALGORITHM

A standard random forest integrates all of its decision trees to obtain the final result. To improve the accuracy of random forest, a novel RF based on Kappa pruning and IBABC is proposed. In this method, the random forest is pre-pruned by the Kappa measure: the CARTs with poor comprehensive performance are eliminated, and the complexity is significantly reduced. In the ABC search, a new candidate solution vi is generated from the current food source xi and a food source xk selected at random from the population. The pseudo-code of the random forest based on the Kappa measure and IBABC is given as an algorithm in Figure 6.
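The Kappa pre-pruning step described above can be illustrated with a minimal sketch. It assumes that diversity is measured with the pairwise Cohen's kappa between the predictions of two trees (lower kappa meaning less agreement, hence more diversity) and that the trees with the lowest mean pairwise kappa are kept. The summary does not give the exact selection rule, so the function names `cohen_kappa` and `kappa_prepruning`, the `keep` parameter, and the ranking by mean kappa are all hypothetical choices made for illustration, not the authors' procedure.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(pred_a, pred_b):
    """Cohen's kappa between two equal-length label sequences.

    1.0 means identical predictions; 0.0 means agreement at chance level.
    """
    n = len(pred_a)
    if n == 0 or n != len(pred_b):
        raise ValueError("prediction sequences must be non-empty and equal in length")
    # Observed agreement: fraction of samples on which both trees agree.
    observed = sum(a == b for a, b in zip(pred_a, pred_b)) / n
    # Chance agreement: computed from each tree's label frequencies.
    count_a, count_b = Counter(pred_a), Counter(pred_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both trees predict the same single class everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def kappa_prepruning(tree_predictions, keep):
    """Return indices of the `keep` trees with the lowest mean pairwise kappa.

    tree_predictions[i] holds tree i's predicted labels on a shared
    validation set. The ranking rule is an assumption for illustration.
    """
    m = len(tree_predictions)
    if m < 2:
        return list(range(m))
    totals = [0.0] * m
    for i, j in combinations(range(m), 2):
        k = cohen_kappa(tree_predictions[i], tree_predictions[j])
        totals[i] += k
        totals[j] += k
    mean_kappa = [t / (m - 1) for t in totals]
    # Lower mean kappa = less agreement with the rest = more diverse.
    ranked = sorted(range(m), key=lambda i: mean_kappa[i])
    return sorted(ranked[:keep])


if __name__ == "__main__":
    predictions = [
        [0, 0, 1, 1],  # tree 0
        [0, 0, 1, 1],  # tree 1, a duplicate of tree 0
        [0, 1, 0, 1],  # tree 2, agrees with the others only at chance level
    ]
    print(kappa_prepruning(predictions, keep=2))
```

In this toy input, one of the two duplicate trees is dropped because it adds no diversity. A full pipeline would follow this step with the accuracy-driven secondary pruning that the paper assigns to IBABC, which is not sketched here.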

EXPERIMENTAL RESULTS
EXPERIMENTAL RESULTS ON THE BENCHMARK
CONCLUSION