Abstract

Applications such as identifying customers from their distinctive purchase patterns require class-wise discriminative feature subsets, called class signatures, for classification. Classifiers such as KNN and SVM must work with the complete feature set, and when they are applied to such applications, using the entire feature set may introduce classification errors. A decision tree classifier generates class-wise prominent feature subsets and can therefore be employed for such applications. However, none of these classifiers models the relationships between the features present in vector data. We therefore propose to model the features and their interrelationships as graphs. Graphs occur naturally in data such as protein molecules and chemical compounds, for which several graph classifiers exist. Multivariate data, however, do not exhibit a natural graph structure. The proposed work therefore focuses on (1) modeling multivariate data as graphs and (2) obtaining class-wise prominent subgraph signatures, which are then used to train classifiers such as SVM for decision making. The proposed method, dSubSign, can also classify multivariate data with missing values without performing imputation or case deletion. Performance analysis on both real-world and synthetic datasets shows that the accuracy of dSubSign is higher than or comparable to that of existing methods.

