Abstract

Gene expression data comprising thousands of genes play an important role in classification platforms and disease diagnosis. It is therefore vital to select a small subset of salient features from a large number of genes. Recently, many researchers have devoted themselves to feature selection using diverse computational intelligence methods. However, in the process of selecting informative genes, many computational methods have difficulty selecting small subsets for cancer classification because of the huge number of genes (high dimension) relative to the small number of samples, and because of noisy and irrelevant genes. In this paper, we propose a new hybrid algorithm, HICATS, which combines the imperialist competition algorithm (ICA), which performs global search, with tabu search (TS), which conducts fine-tuned search. To verify the performance of HICATS, we tested it on 10 well-known benchmark gene expression classification datasets with dimensions varying from 2308 to 12600. The proposed method proved superior to other related works, including the conventional version of the binary optimization algorithm, in terms of classification accuracy and the number of selected genes.

Highlights

  • DNA microarray technology, which can measure the expression levels of thousands of genes simultaneously in biological tissues and produce cancer databases based on gene expression data [1], has great potential for cancer research

  • Global search strategies, rather than local search strategies, have shown their advantages in solving combinatorial optimization problems, and a number of metaheuristic approaches have been applied to feature selection, for example, genetic algorithm (GA), particle swarm optimization (PSO), tabu search (TS), and artificial bee colony (ABC)

  • The experimental results, shown in Tables 4, 5, and 6, include the classification accuracy and the number of selected feature genes obtained by HICATS over 10 independent runs on the datasets 11 Tumors, 9 Tumors, SRBCT, Leukemia 1, Leukemia 2, DLBCL, Prostate Tumor, Lung Cancer, Brain Tumors 1, and Brain Tumors 2

Summary

Introduction

DNA microarray technology, which can measure the expression levels of thousands of genes simultaneously in biological tissues and produce cancer databases based on gene expression data [1], has great potential for cancer research. Global search strategies, rather than local search strategies, have shown their advantages in solving combinatorial optimization problems, and a number of metaheuristic approaches have been applied to feature selection, for example, genetic algorithm (GA), particle swarm optimization (PSO), tabu search (TS), and artificial bee colony (ABC). In this paper, we concentrate on the imperialist competition algorithm, a new swarm intelligence optimization algorithm inspired by sociopolitical behavior, to address feature selection from gene expression data. It starts with an initial population and effectively searches the solution space through specially designed operators to converge to an optimal or near-optimal solution.
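
The fine-tuned search role that the paper assigns to tabu search can be illustrated with a minimal sketch of that local phase alone, applied to one candidate gene subset. The Python below is not the authors' HICATS implementation: the weighted fitness and its `alpha` parameter, the bit-flip neighbourhood, the tabu tenure, and the toy scoring function standing in for classifier accuracy are all assumptions made for illustration.

```python
import random
from collections import deque


def fitness(mask, score_fn, alpha=0.9):
    """Assumed fitness: weighted sum of a subset score and subset sparsity."""
    n_selected = sum(mask)
    if n_selected == 0:
        return 0.0  # an empty gene subset cannot be evaluated
    sparsity = 1.0 - n_selected / len(mask)
    return alpha * score_fn(mask) + (1.0 - alpha) * sparsity


def tabu_refine(mask, score_fn, iters=50, tenure=5, seed=0):
    """Bit-flip tabu search from `mask`; returns (best mask, best fitness)."""
    rng = random.Random(seed)
    current = list(mask)
    best, best_fit = current[:], fitness(current, score_fn)
    tabu = deque(maxlen=tenure)  # indices of recently flipped genes
    for _ in range(iters):
        candidates = []
        for i in range(len(current)):
            neighbour = current[:]
            neighbour[i] ^= 1  # include or exclude gene i
            fit = fitness(neighbour, score_fn)
            # Aspiration: a tabu move is allowed if it beats the best so far.
            if i not in tabu or fit > best_fit:
                candidates.append((fit, rng.random(), i, neighbour))
        if not candidates:
            break
        # Move to the best admissible neighbour, even if it is worse.
        fit, _, i, current = max(candidates)
        tabu.append(i)
        if fit > best_fit:
            best, best_fit = current[:], fit
    return best, best_fit


# Toy stand-in for classifier accuracy (hypothetical): the fraction of two
# "informative" genes that the subset keeps.
INFORMATIVE = {2, 7}


def toy_score(mask):
    return sum(mask[i] for i in INFORMATIVE) / len(INFORMATIVE)


start_rng = random.Random(1)
start = [start_rng.randint(0, 1) for _ in range(12)]  # e.g. a candidate from a global search
best, best_fit = tabu_refine(start, toy_score)
print(best, round(best_fit, 4))
```

On this toy objective the search should settle on the subset containing only the two informative genes. In a real experiment `toy_score` would be replaced by a cross-validated classifier accuracy, and the starting masks would come from the population-based search rather than a random draw.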

Related Algorithm
Proposed HICATS
Experiment
11 Tumors
Methods
Conclusions