Abstract

Decision tree algorithms used for classification in data mining suffer from low efficiency and overfitting. To address these problems, the C4.5 algorithm was studied in depth and an improved algorithm, the BC4.5 algorithm, is proposed. The main idea is to improve the branching and pruning strategies of C4.5. For branching, the range of the attribute information gain ratio is measured and adjusted, and the information gain is compared with the probability obtained by a Bayesian classifier. For pruning, a simplified CCP (Cost-Complexity Pruning) method and evaluation criterion are applied: the gain value at the root node of each subtree of the generated decision tree is checked to determine whether to remove the corresponding nodes and branches. Simulation experiments were conducted on the improved C4.5 algorithm and the traditional algorithm. The results show that the improved algorithm significantly reduces execution time, which is 8.75% shorter than that of the traditional algorithm. As the number of experiments increases, the accuracy of the improved algorithm exceeds 90%.
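
For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal sketch of the standard C4.5 information gain ratio and the standard CCP cost-complexity measure. It does not reproduce the BC4.5 modifications (the adjusted gain-ratio range, the Bayesian comparison, or the simplified CCP), which the abstract does not specify. The function names `entropy`, `gain_ratio`, and `ccp_alpha` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Standard C4.5 gain ratio of one categorical attribute.

    values: attribute value of each sample
    labels: class label of each sample
    """
    n = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after splitting on the attribute.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Split information is the entropy of the attribute values themselves.
    split_info = entropy(values)
    # Assumption: an attribute with a single value gets ratio 0.
    return gain / split_info if split_info > 0 else 0.0


def ccp_alpha(node_error, subtree_error, n_leaves):
    """Standard CCP measure for a subtree with more than one leaf:
    increase in error per leaf removed if the subtree is collapsed."""
    return (node_error - subtree_error) / (n_leaves - 1)


if __name__ == "__main__":
    attribute = ["a", "a", "b", "b"]
    classes = [0, 0, 1, 1]
    print(gain_ratio(attribute, classes))
    print(ccp_alpha(0.3, 0.1, 3))
```

In standard CCP, the subtree with the smallest such value is the first candidate for pruning. How BC4.5 simplifies this criterion is not stated in the abstract.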
