Abstract

Applications such as identifying different customers from their unique buying behaviours, or determining the ratings users give a product based on different sets of features, require classification using class-specific subsets of features. Most existing state-of-the-art classifiers for multivariate data use the complete feature set for classification, regardless of the class label. Decision tree classifiers can produce class-wise subsets of features. However, none of these classifiers models the relationships between features, which may enhance classification accuracy. We call the class-specific subsets of features, together with the features' interrelationships, class signatures. In this work, we propose to map the original input space of multivariate data to a feature space characterized by connected graphs, since graphs can easily model entities, their attributes, and the relationships among attributes. Graphs are mostly used to model entities in domains where they occur naturally, for example, chemical compounds. However, graphs do not occur naturally in multivariate data, so extracting class signatures from multivariate data is a challenging task. We propose feature selection heuristics to obtain class-specific prominent subgraph signatures. We also propose two variants of a class-signature-based classifier: 1) maximum matching signature (gMM), and 2) score and size of matched signatures (gSM). The effectiveness of the proposed approach has been studied on real-world and synthetic datasets and compared with other established classifiers. Experimental results show that the proposed class-signature-based classifier outperforms the other classifiers on most of the datasets.
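The abstract does not give the details of gMM or gSM, so the following is only a toy sketch of the general idea of classifying by matching class signatures, not the authors' algorithm. It assumes that a record can be turned into a graph whose nodes are (feature, value) pairs, that a class signature is a set of edges between such nodes, and that a record is assigned to the class whose signature shares the most edges with it. The names `edge`, `instance_graph`, `predict_by_max_match`, and the example signatures are all hypothetical.

```python
from itertools import combinations


def edge(node_a, node_b):
    """Undirected edge between two (feature, value) nodes (hypothetical helper)."""
    return frozenset((node_a, node_b))


def instance_graph(instance):
    """Map one record to the set of edges linking all of its (feature, value) nodes.

    Assumption: every pair of attribute values in a record is considered related.
    """
    nodes = sorted(instance.items())
    return {edge(a, b) for a, b in combinations(nodes, 2)}


def predict_by_max_match(instance, signatures):
    """Pick the class whose signature shares the most edges with the record.

    Ties are broken alphabetically by class label (an arbitrary choice).
    """
    edges = instance_graph(instance)
    scores = {label: len(sig & edges) for label, sig in signatures.items()}
    best = max(sorted(scores), key=scores.get)
    return best, scores


# Hand-written signatures for illustration only; the paper derives them
# with feature selection heuristics that are not described in the abstract.
signatures = {
    "bargain_hunter": {
        edge(("coupon", "yes"), ("basket", "small")),
        edge(("coupon", "yes"), ("day", "weekend")),
    },
    "bulk_buyer": {
        edge(("basket", "large"), ("day", "weekday")),
    },
}

record = {"coupon": "yes", "basket": "small", "day": "weekday"}
label, scores = predict_by_max_match(record, signatures)
print(label, scores)
```

In this sketch each class is scored only against its own small set of feature relationships, which mirrors the abstract's point that different classes may be characterized by different subsets of features.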
