A hybrid feature selection method based on instance learning and cooperative subset search

Afef Ben Brahim, Mohamed Limam

doi:10.1016/j.patrec.2015.10.005

Abstract

Selecting the most useful features from thousands of candidates in a data set with few samples is a problem that arises in many areas of modern science. Feature subset selection is a key problem in such data mining classification tasks. In practice, filter methods are very commonly used. However, they ignore the correlations between genes, which are prevalent in gene expression data. Standard wrapper algorithms, on the other hand, cannot be applied because of their computational complexity. Additionally, existing methods are not specifically designed to handle the small sample size of the data, which is one of the main causes of feature selection instability. To deal with these issues, we propose a new hybrid filter-wrapper approach based on instance learning. Its main idea is to turn the small sample size from a problem into a tool that allows only a few subsets of features to be chosen in a filter step. A cooperative subset search (CSS) is then proposed, with a classifier algorithm serving as the wrapper's evaluation system. Our method is experimentally tested and compared with state-of-the-art algorithms on several high-dimensional, low-sample-size cancer data sets. Results show that our proposed approach outperforms the other methods in terms of accuracy and stability of the selected subset.
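The general filter-then-wrapper pattern described above can be illustrated with a minimal sketch. This is not the authors' instance-learning filter or their CSS procedure, whose details the abstract does not give. It assumes a simple mean-separation score for the filter step and a greedy forward search scored by leave-one-out 1-nearest-neighbour accuracy for the wrapper step. All function names, parameters, and the toy data are hypothetical.

```python
import random
from statistics import mean, pstdev


def filter_scores(X, y):
    """Filter step (assumed criterion): score each feature by how far apart
    the two class means are, relative to the within-class spread."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        spread = pstdev(a) + pstdev(b) + 1e-12  # avoid division by zero
        scores.append(abs(mean(a) - mean(b)) / spread)
    return scores


def loo_accuracy(X, y, subset):
    """Wrapper evaluation (assumed classifier): leave-one-out accuracy of a
    1-nearest-neighbour classifier restricted to the features in `subset`."""
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in subset)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def hybrid_select(X, y, n_candidates=5, max_size=3):
    """Keep the top-scoring features from the filter, then grow a subset
    greedily while the wrapper accuracy strictly improves."""
    scores = filter_scores(X, y)
    candidates = sorted(range(len(scores)), key=lambda j: -scores[j])[:n_candidates]
    selected, best_acc = [], 0.0
    while len(selected) < max_size:
        trials = [(loo_accuracy(X, y, selected + [j]), j)
                  for j in candidates if j not in selected]
        if not trials:
            break
        acc, j = max(trials, key=lambda t: t[0])
        if acc <= best_acc:
            break
        selected.append(j)
        best_acc = acc
    return selected, best_acc


# Toy high-dimensional, low-sample-size data: 12 samples, 30 features,
# with only feature 0 shifted between the two classes.
rng = random.Random(0)
y = [i % 2 for i in range(12)]
X = [[rng.gauss(0, 1) for _ in range(30)] for _ in y]
for row, label in zip(X, y):
    row[0] += 4.0 * label

selected, acc = hybrid_select(X, y)
print("selected features:", selected, "LOO accuracy:", acc)
```

The filter step keeps the wrapper affordable: the classifier is only ever evaluated on subsets drawn from a handful of pre-screened candidates rather than on all features.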

Full Text