Abstract

Pre-pruned decision trees were incorporated into k-means clustering, using one to three types of tree-depth controllers together with cluster partitioning, to develop a combined algorithm named the Greedy Pre-pruned Tree-based Clustering (GPrTC) algorithm. The pre-pruned clustered decision trees are applied in a greedy, concerted way to five datasets, covering obstructive sleep apnea and others drawn from online data repositories. The optimal number of clusters k for k-means clustering is determined after the trees have been greedily pre-pruned by three tree-depth controllers: the minimum number of leaf nodes, the minimum number of parent nodes and the maximum number of tree splits. When GPrTC was applied to the assigned datasets and compared with conventional k-means clustering, it showed significantly lower average distortion per point and lower average run-time for 2-D and 3-D data over approximately 30,000 points. The classification efficiency and speed of GPrTC are more than twice those of k-means clustering at the higher range of point counts tested. GPrTC also showed better classification accuracy than k-means clustering on almost all the assigned datasets. We conclude that the proposed algorithm is significantly more efficient, produces less distortion and is much faster than k-means clustering, with moderately better classification and/or prediction accuracy.
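
The comparison metric named above can be made concrete with a small sketch. The abstract does not define "average distortion per point"; the code below assumes it is the mean squared Euclidean distance between each point and its assigned centroid. It runs plain Lloyd-style k-means as the baseline only and does not implement GPrTC or its tree pre-pruning. All function names are hypothetical and chosen for illustration.

```python
import math

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

def update(points, labels, centroids):
    """Move each centroid to the mean of its members; leave empty clusters in place."""
    moved = []
    for j, c in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            moved.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            moved.append(tuple(c))
    return moved

def kmeans(points, initial_centroids, max_iter=100):
    """Baseline k-means from caller-supplied initial centroids (deterministic)."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        moved = update(points, labels, centroids)
        if moved == centroids:  # converged: no centroid changed
            break
        centroids = moved
    return centroids, assign(points, centroids)

def average_distortion(points, centroids, labels):
    """ASSUMED definition: mean squared distance from each point to its centroid."""
    total = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return total / len(points)
```

For example, calling `kmeans` on the four 2-D points `(0, 0)`, `(0, 2)`, `(10, 0)`, `(10, 2)` with initial centroids `(0, 0)` and `(10, 0)` and passing the result to `average_distortion` gives a per-point value that can be compared across algorithms on the same data.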

