The curse of dimensionality of features in data classification is still an open issue. An approach to solve this problem is to partition features into several sub-sets of features hence the data classification task for every subset is performed. Then, an ensemble of these classifications are reported as the result of the classification problem. However, the feature set partitioning into sub-sets of features is still an area of research interest. Thus, in this paper, an innovative framework is proposed in which, first, a collaboration measure between each two features is defined and measured. Then, the collaboration graph, consisted of features as nodes and measured collaborations as edges’ weights, is generated according to the collaboration measures calculated. After that, a community detection method is used to find the graph communities. The communities are considered as the feature subsets and a base classifier is trained for each subset based on the corresponding training data of the subsets. Then, the ensemble classifier is created by a combination of base classifiers according to the AdaBoost Aggreagation. The simulation results of the proposed approach over the real and synthetic datasets indicate that the proposed approach considerably increases the classification accuracy in comparison to previous methods.
Read full abstract