Abstract

A new split attribute measure for selecting the node split during decision tree construction is proposed. The measure consists of the sum of class counts of the distinct values of categorical attributes in the dataset. Larger counts induce larger partitions and smaller trees, thereby favoring the determination of the best split attribute. The new split attribute measure is termed maximum exponential class counts (MECC). Experimental results obtained over several categorical datasets from the UCI machine learning repository predominantly indicate that decision tree models built with the proposed MECC node split technique provide better classification accuracy and smaller trees than decision trees built with the popular gain ratio, normalized gain ratio, and Gini index measures. The experiments focus on comparing and analyzing the node splitting measures alone.
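For orientation, the two standard baseline measures named above can be sketched from per-value class counts of a categorical attribute. This is a minimal illustration under stated assumptions, not the paper's code: the function names, the dictionary-based row format, and the toy data are hypothetical, and the MECC formula itself is not given in this abstract, so it is not reproduced here.

```python
from collections import Counter
from math import log2

def class_counts_by_value(rows, attr, target):
    """Map each distinct value of `attr` to a Counter of class labels."""
    counts = {}
    for row in rows:
        counts.setdefault(row[attr], Counter())[row[target]] += 1
    return counts

def entropy(counter):
    """Shannon entropy (base 2) of a Counter of counts."""
    total = sum(counter.values())
    return -sum((c / total) * log2(c / total) for c in counter.values() if c)

def gini(counter):
    """Gini impurity of a Counter of class counts."""
    total = sum(counter.values())
    return 1.0 - sum((c / total) ** 2 for c in counter.values())

def gini_index(rows, attr, target):
    """Weighted Gini impurity after splitting on `attr` (lower is better)."""
    n = len(rows)
    parts = class_counts_by_value(rows, attr, target)
    return sum(sum(p.values()) / n * gini(p) for p in parts.values())

def gain_ratio(rows, attr, target):
    """Information gain divided by split information (higher is better)."""
    n = len(rows)
    parts = class_counts_by_value(rows, attr, target)
    parent = Counter(row[target] for row in rows)
    remainder = sum(sum(p.values()) / n * entropy(p) for p in parts.values())
    gain = entropy(parent) - remainder
    split_info = entropy(Counter({v: sum(p.values()) for v, p in parts.items()}))
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: attribute "a" separates the classes, "b" does not.
rows = [
    {"a": "x", "b": "p", "label": 0},
    {"a": "x", "b": "q", "label": 0},
    {"a": "z", "b": "p", "label": 1},
    {"a": "z", "b": "q", "label": 1},
]
best_by_gain_ratio = max(["a", "b"], key=lambda attr: gain_ratio(rows, attr, "label"))
best_by_gini = min(["a", "b"], key=lambda attr: gini_index(rows, attr, "label"))
```

Both baselines pick attribute "a" on this toy data; a count-based measure such as MECC would be evaluated on the same per-value class counts returned by `class_counts_by_value`.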
