Two hybrid wrapper-filter feature selection algorithms applied to high-dimensional microarray experiments

Javier Apolloni,Guillermo Leguizamón,Enrique Alba

doi:10.1016/j.asoc.2015.10.037

Abstract

Microarray experiments generally deal with complex and high-dimensional samples, and in addition, the number of samples is much smaller than their dimensions. Both issues can be alleviated by using a feature selection (FS) method. In this paper two new, simple, and efficient hybrid FS algorithms, called respectively BDE-XRank and BDE-XRankf, are presented. Both algorithms combine a wrapper FS method based on a Binary Differential Evolution (BDE) algorithm with a rank-based filter FS method. Besides, they generate the initial population with solutions involving only a small number of features. Some initial solutions are built considering only the most relevant features regarding the filter method, and the remaining ones include only random features (to promote diversity). In the BDE-XRankf, a new fitness function, in which the score value of a solution is influenced by the frequency of the features in the current population, is incorporated in the algorithm. The robustness of BDE-XRank and BDE-XRankf is shown by using four Machine Learning (ML) algorithms (NB, SVM, C4.5, and kNN). Six high-dimensional well-known data sets of microarray experiments are used to carry out an extensive experimental study based on statistical tests. This experimental analysis shows the robustness as well as the ability of both proposals to obtain highly accurate solutions at the earlier stages of BDE evolutionary process. Finally, BDE-XRank and BDE-XRankf are also compared against the results of nine state-of-the-art algorithms to highlight its competitiveness and the ability to successfully reduce the original feature set size by more than 99%.

Full Text