Abstract

The goal of feature selection is, given a dataset described by n attributes (features), to find the minimum number m of relevant attributes that describe the data as well as the original attribute set does. Genetic algorithms have been used to implement feature selection algorithms. Previous algorithms in the literature used the predictive accuracy of a specific learning algorithm as the fitness function to maximize over the space of possible feature subsets; such an approach requires a large amount of CPU time to reach a good solution on large datasets. This paper presents a genetic algorithm for feature selection that improves on previous genetic-based approaches: it is independent of any specific learning algorithm and requires less CPU time to reach a relevant subset of features. Reported experiments show that the proposed algorithm is at least ten times faster than a standard genetic algorithm for feature selection, with no loss of predictive accuracy when a learning algorithm is applied to the reduced data.
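
The abstract does not detail the learning-algorithm-independent fitness function, so the sketch below is only a hypothetical illustration of the general scheme: a genetic algorithm that evolves boolean feature masks and scores them with a filter-style measure (correlation-based relevance minus redundancy and a size penalty) instead of a classifier's predictive accuracy. All function names, operators, and parameter values here are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: GA over boolean feature masks with a filter-style
# (learning-algorithm-independent) fitness. Not the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, X, y, size_penalty=0.01):
    """Filter-style score: relevance to the class minus redundancy and size (assumed measure)."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return -np.inf
    Xs = X[:, idx]
    # relevance: mean |correlation| between each selected feature and the class
    rel = np.mean([abs(np.corrcoef(Xs[:, j], y)[0, 1]) for j in range(idx.size)])
    # redundancy: mean |correlation| among selected features (0 if only one feature)
    red = 0.0
    if idx.size > 1:
        c = np.abs(np.corrcoef(Xs, rowvar=False))
        red = (c.sum() - idx.size) / (idx.size * (idx.size - 1))
    return rel - red - size_penalty * idx.size

def ga_feature_selection(X, y, pop_size=30, generations=50, p_mut=0.02):
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5          # random boolean masks
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        # binary tournament selection
        winners = [max(rng.choice(pop_size, 2, replace=False), key=lambda i: scores[i])
                   for _ in range(pop_size)]
        parents = pop[np.array(winners)]
        # single-point crossover on consecutive parent pairs
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            cut = rng.integers(1, n)
            children[i, cut:] = parents[i + 1, cut:]
            children[i + 1, cut:] = parents[i, cut:]
        # bit-flip mutation
        children ^= rng.random((pop_size, n)) < p_mut
        # elitism: carry over the best individual of the current generation
        children[0] = pop[np.argmax(scores)]
        pop = children
    scores = np.array([fitness(ind, X, y) for ind in pop])
    return pop[np.argmax(scores)]

# toy usage: 20 features, only the first three informative
X = rng.normal(size=(200, 20))
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(float)
best = ga_feature_selection(X, y)
print("selected features:", np.flatnonzero(best))
```

Because this kind of fitness requires no training of a learning algorithm, each evaluation is cheap, which reflects the CPU-time saving the abstract attributes to a learning-algorithm-independent criterion.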
