Abstract

This paper presents a comparative study of the most commonly used attribute selection measures in decision tree construction. We examine how these measures affect the resulting tree structures, and our experiments study these effects under various sampling policies. Earlier work in this field has emphasized the overall size of the tree, pruned or unpruned, in terms of the number of levels and the number of leaf nodes. We take a more informative view of tree structure, one that takes the functionality of decision trees into account: our structure-evaluating criterion combines classification proportions with combinatorial structure. We demonstrate that, for unpruned trees, the information-based measures outperform the non-information-based ones when evaluated against classification proportion thresholds. Among the information-based measures, information gain appears to perform best. Pruning improves the performance of the non-information-based measures. We also show that classification performance depends not only on the attribute selection measure but also on the sampling policy.
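For readers unfamiliar with the term, the following is a minimal sketch of how information gain, the measure named above, is commonly computed when choosing a split attribute. It uses the standard textbook definition (entropy before the split minus the weighted entropy after it) and is not taken from the paper. The function names, the toy `rows` and `labels` data, and the attribute names are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by partitioning the examples on `attribute`.

    `rows` is a list of dicts mapping attribute names to values;
    `labels` holds the class label of each row, in the same order.
    """
    # Group the class labels by the value each row takes for the attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the partitions after the split.
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: "outlook" separates the classes perfectly,
# while "windy" carries no information about the class.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rain", "windy": False},
    {"outlook": "rain", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

chosen = best_attribute(rows, labels, ["outlook", "windy"])
```

On this toy data the sketch selects `outlook`. A tree builder would apply the same selection recursively to each partition.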
