cGAAM – An Algorithm for Simultaneous Feature Selection and Clustering

Izabela Rejer

doi:10.1007/978-3-030-19738-4_16

Abstract

This paper introduces a modified version of cGAAM, a genetic algorithm for feature selection for clustering. The results show that the algorithm is able to find significant subsets of features in data sets that differ in size and number of classes. All the data sets used to test cGAAM share one property: their examples come with class labels. Because of this, although the clustering itself was performed without the class labels, the feature sets chosen by cGAAM could be compared, in terms of classification accuracy, with the feature subsets returned by the Lasso method. The most important observation is that the classification accuracy obtained with the feature subsets returned by cGAAM was not only comparable with the accuracy obtained with the subsets returned by Lasso, but was also almost always higher than 80% (ionosphere dataset) and 90% (humanactivity dataset).
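The evaluation idea described above, where labels are withheld during selection and used only afterwards to score a chosen feature subset, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the classifier, so leave-one-out 1-nearest-neighbour is used here as a stand-in; the function name `loo_1nn_accuracy` is hypothetical; and the toy data and the two candidate subsets are made up rather than produced by cGAAM or Lasso.

```python
from math import dist

def loo_1nn_accuracy(X, y, features):
    """Leave-one-out 1-NN accuracy using only the given feature indices.

    Assumption: 1-NN is an illustrative stand-in classifier, not
    necessarily the one used in the paper.
    """
    # Project every example onto the candidate feature subset.
    projected = [[row[j] for j in features] for row in X]
    correct = 0
    for i, point in enumerate(projected):
        # Find the closest *other* example in the projected space.
        nearest = min(
            (k for k in range(len(projected)) if k != i),
            key=lambda k: dist(point, projected[k]),
        )
        # The labels are used only here, to score the subset.
        correct += y[nearest] == y[i]
    return correct / len(projected)

# Made-up toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
     [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

# Two hypothetical candidate subsets, standing in for the output of
# two different selection methods.
acc_informative = loo_1nn_accuracy(X, y, [0])
acc_uninformative = loo_1nn_accuracy(X, y, [1])
print(acc_informative, acc_uninformative)
```

Scoring both subsets with the same classifier and the same data split is what makes the accuracies comparable across selection methods.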

Full Text