Abstract

This paper focuses on a new type of taxonomy called a supervised taxonomy (ST). Supervised taxonomies are generated by considering background information about class labels in addition to distance metrics, and they are capable of capturing class-uniform regions in a dataset. A hierarchical, agglomerative clustering algorithm called STAXAC, which generates STs, is proposed and its properties are analyzed. Experimental results show that STAXAC produces purer taxonomies than the neighbor-joining (NJ) algorithm, a very popular taxonomy generation algorithm. We introduce novel measures and algorithms that assess classification complexity and class modality, and we show that STs can be used as the main input of an effective data-editing tool that enhances the accuracy of k-nearest neighbor classifiers. Our experimental evaluation demonstrates that assessing the classification complexity of an ST provides a good estimate of the difficulty of the classification problem at hand. Moreover, a class modality discovery tool (CMD) is provided that, based on a domain expert's notion of what constitutes a “note-worthy” subclass, determines whether specific classes in the dataset are zero-modal, unimodal, or multi-modal.
