A Novel Gene Ranking Algorithm Based on Random Subspace Method

Ruichu Cai, Zhifeng Hao, Wen Wen

doi:10.1109/ijcnn.2007.4370958

Abstract

Gene selection aims to identify the most informative genes in the full gene set. It is an important preprocessing step for the discriminant analysis of microarray data, because many genes are irrelevant or redundant to the discrimination problem. In this paper, gene selection is treated as a gene ranking problem, and a random subspace method based gene ranking (RSM-GR) algorithm is proposed. RSM-GR first generates random subsets of the genes; it then trains a Support Vector Machine on each subset to produce an importance factor for each gene in that subset; finally, the importance factors obtained for each gene across the randomly selected subsets are combined into its final importance. Experiments on two public datasets show that RSM-GR yields gene sets that lead to more accurate classification than other gene selection methods, while requiring less computation time. RSM-GR also copes better with datasets that contain a large number of genes and with settings in which a large number of genes must be selected.
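The three steps described in the abstract (random gene subsets, one SVM per subset, combined importance) can be illustrated with a minimal sketch. The abstract does not specify how the importance factor is computed or how the factors are combined, so the choices here are assumptions: the importance factor is taken to be the absolute weight of a linear SVM, the combination rule is the mean over the subsets that contain the gene, and the SVM is trained with a simple Pegasos-style subgradient method on features of roughly unit scale. The function names, parameters, and toy data are hypothetical and are not taken from the paper.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=40, rng=None):
    """Pegasos-style subgradient descent for a linear SVM without a bias term.

    This is a stand-in solver for illustration; labels must be +1 or -1.
    """
    rng = rng or random.Random(0)
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    step = 0
    for _ in range(epochs):
        order = list(range(n_samples))
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink the weights (regularisation term of the objective).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Take a hinge-loss step only when the margin is violated.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def rsm_gene_ranking(X, y, n_subsets=100, subset_size=5, seed=0):
    """Rank genes by importance averaged over random gene subsets."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    total = [0.0] * n_genes   # summed importance per gene
    count = [0] * n_genes     # number of subsets each gene appeared in
    for _ in range(n_subsets):
        # Step 1: draw a random subset of genes (a random subspace).
        genes = rng.sample(range(n_genes), subset_size)
        X_sub = [[row[g] for g in genes] for row in X]
        # Step 2: train an SVM on the subset only.
        w = train_linear_svm(X_sub, y, rng=rng)
        # Assumed importance factor: absolute SVM weight of the gene.
        for g, wj in zip(genes, w):
            total[g] += abs(wj)
            count[g] += 1
    # Step 3 (assumed combination rule): mean over the subsets containing the gene.
    importance = [s / c if c else 0.0 for s, c in zip(total, count)]
    ranking = sorted(range(n_genes), key=lambda g: importance[g], reverse=True)
    return ranking, importance


def make_toy_data(n_samples=40, n_genes=20, seed=1):
    """Synthetic data: gene 0 carries the class signal, the rest are noise."""
    rng = random.Random(seed)
    y = [1 if i % 2 == 0 else -1 for i in range(n_samples)]
    X = []
    for label in y:
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        row[0] = label + rng.gauss(0.0, 0.3)
        X.append(row)
    return X, y


X, y = make_toy_data()
ranking, importance = rsm_gene_ranking(X, y)
print("top genes:", ranking[:5])
```

Because each SVM sees only `subset_size` genes rather than the full gene set, the cost of a single training run stays small, which is in keeping with the lower computation time reported in the abstract. On real microarray data the expression values would normally be standardized first, since raw SVM weights depend on feature scale.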
