Abstract

Decision trees are widely used for classification because their structure is simple and easy to interpret. Most decision tree algorithms assume that all instances in a dataset have the same degree of confidence, so they apply the same generation and pruning strategies to every training instance. In practice, instances with a higher degree of confidence are more useful than those with a lower degree of confidence in the same dataset. Instances should therefore be treated differently according to their degrees of confidence when training classifiers. In this paper, we investigate the impact and significance of the degree of confidence of instances on the classification performance of decision tree algorithms, taking the classification and regression tree (CART) algorithm as an example. First, the degree of confidence of instances is quantified from a statistical perspective. Then, a modified CART algorithm named C_CART is proposed by introducing the confidence of instances into the generation and pruning processes of the CART algorithm. Finally, we conduct experiments to evaluate the performance of the C_CART algorithm. The experimental results show that C_CART can significantly improve generalization performance and avoid the over-fitting problem to a certain extent.
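
The abstract does not state how confidence enters the tree-growing step, so the following is only an illustrative sketch and not the C_CART formulation. It assumes that each instance carries a confidence value in [0, 1] and that this value acts as a weight in the Gini impurity that CART uses to score splits. The function names `weighted_gini` and `split_impurity`, and the example confidence values, are hypothetical.

```python
from collections import defaultdict


def weighted_gini(labels, weights):
    """Gini impurity where each instance contributes its weight, not a count of 1.

    Assumption: `weights` holds per-instance confidence values; this is one
    plausible way to use confidence, not the method defined in the paper.
    """
    total = sum(weights)
    if total == 0:
        return 0.0
    mass = defaultdict(float)
    for label, weight in zip(labels, weights):
        mass[label] += weight
    return 1.0 - sum((m / total) ** 2 for m in mass.values())


def split_impurity(left, right):
    """Impurity of a candidate split; each side is a (labels, weights) pair.

    The two child impurities are averaged by the confidence mass on each side.
    """
    left_mass, right_mass = sum(left[1]), sum(right[1])
    total = left_mass + right_mass
    if total == 0:
        return 0.0
    return (left_mass / total) * weighted_gini(*left) + (
        right_mass / total
    ) * weighted_gini(*right)


labels = ["a", "a", "b", "b"]
uniform = [1.0, 1.0, 1.0, 1.0]        # standard CART: every instance counts equally
low_conf_b = [1.0, 1.0, 0.2, 0.2]     # hypothetical: the "b" instances are less trusted

gini_uniform = weighted_gini(labels, uniform)
gini_weighted = weighted_gini(labels, low_conf_b)
```

With uniform weights this reduces to the ordinary Gini impurity. When the "b" instances are down-weighted, the node looks purer because low-confidence instances have less influence on the split score, which is the kind of behavior the abstract describes at a high level.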
