Graph Based Feature Selection Using Symmetrical Uncertainty in Microarray Dataset

Soodeh Bakhshandeh ,Azmi Azmi ,Mohammad Teshnehlab

doi:10.7508/jist.2019.01.004

Abstract

Microarray data with small samples and thousands of genes makes a difficult challenge for researches. Using gene selection in microarray data helps to select the most relevant genes from original dataset with the purpose of reducing the dimensionality of the microarray data as well as increasing the prediction performance. In this paper, a new gene selection method is proposed based on community detection technique and ranking the best genes. Symmetric Uncertainty is used for selection of the best genes by calculation of similarity between two genes and between each gene and class label which leads to representation of search space as a graph, in the first step. Afterwards, the proposed graph is divided into several clusters using community detection algorithm and finally, after ranking the genes, the genes with maximum ranks are selected as the best genes. This approach is a supervised/unsupervised filter-based gene selection method that minimizes the redundancy between genes and maximizes the relevance of genes and class label. Performance of the proposed method is compared with thirteen well-known unsupervised/supervised gene selection approaches over six microarray datasets using four classifiers including SVM, DT, NB and k-NN. Results show the advantages of the proposed approach.

Full Text