Abstract

Recent advances in microarray data analysis make it easy to simultaneously measure the expression levels of several thousand genes. These levels can be used to distinguish cancerous tissues from normal ones. In this work, we are interested in dimension reduction of gene expression data for cancer classification, a common task in most microarray data analysis studies. This reduction plays an essential role in improving classification accuracy and in helping biologists predict cancer accurately; it is carried out by selecting a small subset of relevant genes and eliminating redundant or noisy genes. In this context, we propose a hybrid approach (MWIS-ACO-LS) for the gene selection problem. It combines two components. The first is a new graph-based approach for gene selection (MWIS), in which we seek to minimize the redundancy between genes, as measured by the correlation between them, and to maximize the gene-ranking (Fisher) scores. The second is a modified ACO coupled with a local search (LS) algorithm, which uses the 1NN classifier to measure the quality of the candidate subsets. To evaluate the proposed method, we tested MWIS-ACO-LS on ten well-replicated, high-dimensional microarray datasets ranging from 2308 to 12600 genes. The experimental results on these ten classification problems demonstrated the effectiveness of the proposed method.

Highlights

  • This technology, developed in the early 1990s, allowed researchers to simultaneously measure the expression levels of several thousand genes [1, 2]. These expression levels are very important for the detection or classification of the specific tumor type. The microarray data is transformed into gene expression matrices, where a row represents an experimental condition and a column represents a gene; each value xij is the measured expression level of the jth gene in the ith sample

  • The process begins by transforming the initial dataset into a vertex-weighted graph (Algorithm 1), in which we search for the maximum weight independent set (MWIS). Because this is a well-known NP-hard problem, we propose a greedy algorithm (Algorithm 2) to find a near-optimal set of vertices. The subset of genes selected in this stage is taken as input to the second stage of selection, which uses an evolutionary algorithm (ACO) combined with a local search algorithm to select the minimum number of genes that gives the maximum classification accuracy for the 1NN classifier

  • To limit the search space and speed up the convergence of the proposed approach, a first subset of genes is selected with a graph-theory algorithm for gene selection (MWIS), and a modified ACO-1NN coupled with a local search algorithm is then applied to find a better subset of genes. The quality of a candidate subset is measured by the performance of the K nearest neighbor (KNN) classifier, obtained using LOOCV, and by the size of the subset
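
The first stage described above can be illustrated with a small sketch. The paper's Algorithms 1 and 2 are not reproduced on this page, so everything below is an assumption made for illustration: genes are connected by an edge when the absolute Pearson correlation of their expression vectors reaches a hypothetical `threshold`, vertex weights stand in for precomputed Fisher scores, and the greedy rule (take the heaviest remaining vertex, then discard its neighbors) is a common MWIS heuristic rather than the authors' exact procedure. The function names `pearson`, `build_graph`, and `greedy_mwis` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant gene is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def build_graph(columns, threshold):
    """Connect two genes when |correlation| >= threshold (assumed rule).

    `columns` holds one expression vector per gene.
    """
    n = len(columns)
    adj = {j: set() for j in range(n)}
    for j in range(n):
        for k in range(j + 1, n):
            if abs(pearson(columns[j], columns[k])) >= threshold:
                adj[j].add(k)
                adj[k].add(j)
    return adj


def greedy_mwis(weights, adj):
    """Greedy heuristic: repeatedly keep the heaviest vertex, drop its neighbors."""
    remaining = set(adj)
    chosen = []
    while remaining:
        # Highest weight first; ties go to the lowest gene index.
        best = max(remaining, key=lambda v: (weights[v], -v))
        chosen.append(best)
        remaining -= adj[best] | {best}
    return chosen


# Toy data: gene 1 is a scaled copy of gene 0, gene 2 is weakly correlated.
columns = [[1, 2, 3, 4], [2, 4, 6, 8], [1, -1, 1, -1]]
weights = [0.9, 0.5, 0.7]  # placeholder scores, not real Fisher scores
adj = build_graph(columns, threshold=0.8)
selected = greedy_mwis(weights, adj)
print(selected)  # [0, 2]
```

In this toy case gene 1 is dropped because it is redundant with the higher-scoring gene 0, which is the behavior the graph formulation is meant to encourage: high-scoring genes that are not strongly correlated with each other.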


Summary

Introduction

DNA microarray technology has grown tremendously, thanks to its unquestionable scientific merit. The first works on DNA microarray classification were published at the end of the 1990s [13, 14]. In this context, several researchers have used metaheuristic methods to solve the feature selection (gene selection) problem, in order to facilitate the recognition of cancer cells: the ACO algorithm [15,16,17,18,19,20], PSO [4, 6, 21,22,23,24,25], the genetic algorithm [4, 26, 27], a method incorporating the imperialist competition algorithm (ICA) [28], and the binary differential evolution (BDE) algorithm [29]. A wrapper method based on a modified ACO and a new local search algorithm guided by the 1NN classifier is developed. In this step, the role of 1NN is to evaluate each candidate gene subset generated.
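
The role of 1NN as a subset evaluator can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes squared Euclidean distance restricted to the candidate genes, leave-one-out cross-validation over the samples, and a hypothetical function name `loocv_1nn_accuracy`. How the accuracy is combined with the subset size inside the ACO and local search is not specified on this page and is left out.

```python
def loocv_1nn_accuracy(samples, labels, subset):
    """Leave-one-out accuracy of a 1NN classifier using only the genes in `subset`.

    `samples` is a list of rows (one per sample), `labels` the class of each
    row, and `subset` a list of gene (column) indices.
    """
    correct = 0
    for i, held_out in enumerate(samples):
        best_dist, best_label = None, None
        for k, other in enumerate(samples):
            if k == i:
                continue  # the held-out sample cannot be its own neighbor
            # Squared Euclidean distance over the selected genes (assumed metric).
            dist = sum((held_out[j] - other[j]) ** 2 for j in subset)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[k]
        if best_label == labels[i]:
            correct += 1
    return correct / len(samples)


# Toy matrix: rows are samples, columns are genes.
# Gene 0 separates the classes; gene 2 is misleading noise.
samples = [
    [1.0, 0.2, 5.0],
    [1.1, 0.9, 1.0],
    [3.0, 0.1, 5.1],
    [3.2, 0.8, 0.9],
]
labels = ["normal", "normal", "tumor", "tumor"]

print(loocv_1nn_accuracy(samples, labels, [0]))  # 1.0
print(loocv_1nn_accuracy(samples, labels, [2]))  # 0.0
```

A wrapper search would call such a function once per candidate subset, which is why reducing the gene pool in a first stage matters: each evaluation costs on the order of the squared number of samples times the subset size.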

Graph Theory Approach for Gene Selection
Discussion
Conclusion
Method