Abstract

Feature selection is the process of selecting an optimal subset of features required for maintaining or improving the performance of data mining models. Recently, hybrid filter/wrapper feature selection methods have shown promising results for high-dimensional data. However, filter/wrapper methods lack of generalisation power, which enables the selected features to be trainable over different classifiers without having to repeat the feature selection process. To address the generalisation power problem, this paper proposes a novel evolutionary-based filter feature selection algorithm that is sequentially hybridised with the Fisher score filter algorithm in a new hybrid framework called filter/filter. The proposed algorithm is based on a long-term memory Tabu Search combined with an Asexual (i.e. mutation-based) Genetic Algorithm (TAGA). TAGA benefits from a new integer-encoded solution representation, a novel mutation operator, a new tabu list encoding scheme, and uses a minimum redundancy maximum relevance information theory-based criterion as the fitness function. Experiments were carried out on various high-dimensional datasets including image, text, and biological data. The goodness of the selected subsets was evaluated using different classifiers and the experimental results demonstrate that TAGA outperforms other conventional and state-of-the-art feature selection algorithms.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call