Abstract

The development of microarray technology has supplied a large volume of data to many fields. Gene microarray analysis and classification have proven effective for diagnosing diseases, including cancers. Because the data obtained from microarray technology are very noisy and contain thousands of features, feature selection plays an important role in removing irrelevant and redundant features and in reducing computational complexity. There are two main approaches to gene selection in microarray data analysis: filters and wrappers. To select a concise subset of informative genes, we introduce a hybrid feature selection method that combines the two approaches. Candidate features are first selected from the original set by several effective filters. The candidate feature set is then refined by more accurate wrappers. Thus, we can take advantage of both the filters and the wrappers. Experimental results on 11 microarray datasets show that our mechanism is effective with a smaller feature set. Moreover, these feature subsets can be obtained in a reasonable time.

Highlights

  • Microarray technology has provided the ability to measure the expression levels of thousands of genes simultaneously in a single experiment.

  • We propose an ensemble of filters and wrappers to cope with gene microarray classification problems.

  • The idea is to combine the efficiency of filters with the accuracy of wrappers.
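
The filter-then-wrapper idea can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the signal-to-noise filter, the nearest-centroid classifier, the greedy forward search, the function names (`filter_scores`, `loo_accuracy`, `hybrid_select`), the parameter values, and the synthetic data are all assumptions chosen only to keep the example short and dependency-free.

```python
import random
from statistics import mean, pstdev


def filter_scores(X, y):
    """Filter stage: signal-to-noise score per feature for a two-class problem."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-12))
    return scores


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on `features`."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroids[label] = [mean(r[j] for r in rows) for j in features]
        point = [X[i][j] for j in features]
        pred = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(X)


def hybrid_select(X, y, n_candidates=10, max_features=3):
    # Filter stage: cheaply keep only the top-scoring candidate features.
    scores = filter_scores(X, y)
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    candidates = ranked[:n_candidates]
    # Wrapper stage: greedy forward selection driven by classifier accuracy.
    selected, best = [], 0.0
    while candidates and len(selected) < max_features:
        acc, j = max((loo_accuracy(X, y, selected + [j]), j) for j in candidates)
        if acc <= best:  # stop when no candidate improves accuracy
            break
        selected.append(j)
        candidates.remove(j)
        best = acc
    return selected, best


# Synthetic stand-in for expression data: 20 samples, 50 "genes" (assumed values).
random.seed(0)
y = [0] * 10 + [1] * 10
X = [[random.gauss(0, 1) for _ in range(50)] for _ in y]
for row, label in zip(X, y):
    row[7] += 6.0 * label  # make feature 7 informative by construction

selected, acc = hybrid_select(X, y)
print(selected, acc)
```

The expensive wrapper evaluation runs only on the few candidates that survive the filter, which is where the time saving of a hybrid scheme comes from.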

Summary

Introduction

Microarray technology has provided the ability to measure the expression levels of thousands of genes simultaneously in a single experiment. With a certain number of samples, investigations can be made into whether there are patterns or dissimilarities across samples of different types, such as cancerous versus normal, or even within subtypes of a disease [1]. Microarray analysis is challenged by its high number of features (genes) and small sample sizes (for example, the lung dataset [2] contains 12535 genes and only 181 samples). To avoid the curse of dimensionality, gene selection plays a crucial role in DNA microarray analysis. Another important reason to reduce dimensionality is to help biologists identify the underlying mechanisms that relate gene expression to diseases.
