Feature selection for multi-class problems by using pairwise-class and all-class techniques

Mingyu You, Guo-Zheng Li

doi:10.1080/03081079.2010.530027

Abstract

Feature selection is a key technology in massive data processing, for example in microarray data analysis, where there are few samples but a high-dimensional gene space. A common problem in multi-class microarray data analysis is that recognition or prediction accuracy is unbalanced across classes, which usually degrades overall system performance. One of the main causes is a feature (gene) selection method that favours some classes over others. In this paper, a novel feature selection framework using pairwise-class and all-class techniques (named FrPA) is proposed to balance performance across classes and improve average accuracy. The feature (gene) rank list computed on all classes and the rank lists computed on each pair of classes are all taken into account during feature selection. A round-robin strategy is embedded in the framework to select the final features from the different rank lists. Experimental results on several microarray data sets show that FrPA helps to achieve higher classification accuracy and more balanced performance across classes.
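The combination of an all-class rank list, pairwise-class rank lists, and round-robin selection can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the ranking criterion, the order in which the lists are visited, or how duplicates are handled. The between-class to within-class variance ratio, the "all-class list first, then class pairs in sorted order" visiting order, the skipping of already selected features, and all function names are assumptions made here for illustration.

```python
from itertools import combinations
from statistics import mean, pvariance


def variance_ratio(values, labels):
    """Between-class over within-class variance for one feature.

    Assumed ranking criterion; the abstract does not name one.
    """
    overall = mean(values)
    between = 0.0
    within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        between += len(group) * (mean(group) - overall) ** 2
        within += len(group) * pvariance(group)
    return between / (within + 1e-12)  # small constant avoids division by zero


def rank_features(X, y, classes=None):
    """Return feature indices sorted best-first.

    If `classes` is given, only samples with those labels are used,
    which yields a pairwise-class rank list.
    """
    rows = [(x, c) for x, c in zip(X, y) if classes is None or c in classes]
    labels = [c for _, c in rows]
    n_features = len(X[0])
    scores = [
        variance_ratio([x[j] for x, _ in rows], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: -scores[j])


def round_robin_select(rank_lists, k):
    """Visit the rank lists in turn; each contributes its best unused feature."""
    selected, seen = [], set()
    positions = [0] * len(rank_lists)
    while len(selected) < k:
        progressed = False
        for i, ranks in enumerate(rank_lists):
            # Skip features that another list has already contributed.
            while positions[i] < len(ranks) and ranks[positions[i]] in seen:
                positions[i] += 1
            if positions[i] < len(ranks):
                feature = ranks[positions[i]]
                selected.append(feature)
                seen.add(feature)
                positions[i] += 1
                progressed = True
                if len(selected) == k:
                    break
        if not progressed:  # every list is exhausted
            break
    return selected


def select_features_pairwise_and_all(X, y, k):
    """Build one all-class list plus one list per class pair, then merge."""
    rank_lists = [rank_features(X, y)]
    for a, b in combinations(sorted(set(y)), 2):
        rank_lists.append(rank_features(X, y, classes={a, b}))
    return round_robin_select(rank_lists, k)
```

In this sketch `X` is a list of samples (each a list of feature values) and `y` the matching class labels. Because every class pair owns a rank list, a feature that separates only two hard-to-distinguish classes can still be picked even when it ranks low on the all-class list, which is the balancing effect the abstract describes.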
