ANOVA-SRC-BPSO: a hybrid filter and swarm optimization-based method for gene selection and cancer classification using gene expression profiles

Salim Sazzed

doi:10.21428/594757db.9e9e0337

Abstract

Gene expression profiling reveals the activity of thousands of genes, which can help identify cancer biomarkers. However, the large number of genes in these profiles imposes a high computational burden on classifiers. To deal with this high-dimensional feature space, we introduce a three-phase feature selection framework, ANOVA-SRC-BPSO. In the first phase, ANOVA-SRC-BPSO identifies the highly class-correlated genes using the analysis of variance (ANOVA) F-test. In the second phase, we employ Spearman rank-order correlation (SRC) to eliminate redundant genes. Finally, we use binary particle swarm optimization (BPSO) with a support vector machine (SVM) classifier to select an optimized feature subset. We report the accuracy of ANOVA-SRC-BPSO with the SVM classifier on seven gene expression datasets. Comparisons with fourteen state-of-the-art methods show that ANOVA-SRC-BPSO yields the highest accuracy on five datasets. Moreover, we show that the performance of various feature selection approaches is inconsistent across gene expression datasets.
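
The three phases described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes NumPy, SciPy, and scikit-learn, runs on synthetic data, and uses the standard sigmoid-transfer BPSO update, which may differ from the variant used in the paper. The function names, the filter size `k`, the correlation `threshold`, the swarm parameters, and the fitness function (3-fold cross-validated accuracy of a linear SVM) are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def anova_filter(X, y, k):
    """Phase 1: indices of the k genes with the largest ANOVA F-statistic."""
    f_scores, _ = f_classif(X, y)
    f_scores = np.nan_to_num(f_scores)  # constant genes give NaN scores
    return np.argsort(f_scores)[::-1][:k]  # best-scoring gene first


def src_filter(X, ranked, threshold):
    """Phase 2: walk the ranking and drop any gene whose |Spearman rho|
    with an already kept gene exceeds the threshold (assumed rule)."""
    rho, _ = spearmanr(X[:, ranked])  # gene-by-gene correlation matrix
    rho = np.abs(rho)
    kept = []
    for i in range(len(ranked)):
        if all(rho[i, j] <= threshold for j in kept):
            kept.append(i)
    return ranked[kept]


def fitness(X, y, mask):
    """Assumed fitness: mean 3-fold CV accuracy of a linear SVM."""
    if not mask.any():
        return 0.0  # an empty subset cannot be classified
    return cross_val_score(SVC(kernel="linear"), X[:, mask], y, cv=3).mean()


def bpso(X, y, n_particles=8, n_iter=10, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Phase 3: binary PSO over 0/1 gene masks with a sigmoid transfer."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = (rng.random((n_particles, d)) < 0.5).astype(int)
    vel = rng.uniform(-1.0, 1.0, (n_particles, d))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(X, y, p.astype(bool)) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -6.0, 6.0)  # keep the sigmoid away from 0 and 1
        prob = 1.0 / (1.0 + np.exp(-vel))
        pos = (rng.random((n_particles, d)) < prob).astype(int)
        fit = np.array([fitness(X, y, p.astype(bool)) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), gbest_fit


# Synthetic stand-in for an expression matrix: 60 samples, 200 "genes".
X, y = make_classification(n_samples=60, n_features=200, n_informative=10,
                           n_redundant=10, random_state=0)
ranked = anova_filter(X, y, k=40)
nonredundant = src_filter(X, ranked, threshold=0.8)
mask, acc = bpso(X[:, nonredundant], y)
selected = nonredundant[mask]  # column indices into the original matrix
print(len(ranked), len(nonredundant), len(selected), round(acc, 3))
```

In this sketch, the filters and the swarm search all see the full dataset, so the printed accuracy is optimistic. An unbiased estimate would require repeating the whole selection inside each training fold and scoring on held-out samples.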

Full Text