Abstract

Dimensionality reduction by feature selection is one of the fundamental steps of the data pre-processing stage in intelligent data analysis. The feature selection (FS) literature embodies a wide spectrum of algorithms, methods and strategies, but almost all fall into two classes: the well-known wrappers and filters. The decision of which feature or variable is selected for, or discarded from, the current best subset is still a subject of research. In this paper, an experimental study of non-deterministic local search methods as the main engine for this decision making is presented. The Simulated Annealing algorithm, the Genetic Algorithm, Tabu Search and the Threshold Accepting algorithm are analyzed. They are used to select subsets of features on several real and artificial data sets with different configurations -- i.e., continuous and discrete data, and high or low numbers of cases/features -- in a wrapper fashion. The Nearest Neighbor classifier, the Linear and Quadratic Discriminant classifiers, the Naive Bayes classifier and the Support Vector Machine are evaluated as the performance function in the wrapper scheme.
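To make the wrapper idea concrete, the following is a minimal sketch, not the paper's implementation, of one of the studied combinations: simulated-annealing search over feature subsets, scored by the leave-one-out accuracy of a 1-Nearest-Neighbor classifier. The synthetic data set, the parameter values (`iters`, `t0`, `cooling`) and all function names are illustrative assumptions.

```python
import math
import random

random.seed(0)

def make_data(n=40):
    """Hypothetical toy data: features 0-1 are informative, 2-3 are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + random.gauss(0, 0.3),   # informative
                  -label + random.gauss(0, 0.3),  # informative
                  random.gauss(0, 1.0),           # pure noise
                  random.gauss(0, 1.0)])          # pure noise
        y.append(label)
    return X, y

def nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of 1-NN restricted to the selected features.
    This is the wrapper's performance function."""
    feats = [j for j, m in enumerate(mask) if m]
    if not feats:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_d, best_j = float("inf"), -1
        for j in range(len(X)):
            if i == j:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in feats)
            if d < best_d:
                best_d, best_j = d, j
        correct += (y[best_j] == y[i])
    return correct / len(X)

def sa_feature_selection(X, y, iters=200, t0=0.5, cooling=0.98):
    """Simulated annealing: flip one feature bit per step; accept worse
    subsets with probability exp(delta / temperature)."""
    n_feats = len(X[0])
    mask = [1] * n_feats                     # start from the full feature set
    score = nn_accuracy(X, y, mask)
    best_mask, best_score = mask[:], score
    t = t0
    for _ in range(iters):
        cand = mask[:]
        cand[random.randrange(n_feats)] ^= 1  # flip one feature in/out
        cand_score = nn_accuracy(X, y, cand)
        delta = cand_score - score
        if delta >= 0 or random.random() < math.exp(delta / t):
            mask, score = cand, cand_score
            if score > best_score:
                best_mask, best_score = mask[:], score
        t *= cooling                          # geometric cooling schedule
    return best_mask, best_score

X, y = make_data()
mask, acc = sa_feature_selection(X, y)
print("selected features:", mask, "LOO accuracy:", round(acc, 3))
```

Swapping `nn_accuracy` for any other classifier's accuracy yields the other wrapper variants studied in the paper, and replacing the acceptance rule with a fixed threshold gives the Threshold Accepting variant.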
