Abstract

The gene microarray analysis and classification have demonstrated an effective way for the effective diagnosis of diseases and cancers. However, it has been also revealed that the basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using RotBoost ensemble methodology. This method is a combination of Rotation Forest and AdaBoost techniques which in turn preserve both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of the RotBoost, other nonensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by the conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost.

Highlights

  • Previous studies have shown that gene microarray data analysis is a powerful and revolutionary tool for biological and medical researches by allowing the simultaneous monitoring of the expression levels of tens of thousands of genes [1].This is done by measuring the signal intensity of fluorescing molecules attached to DNA species that are bound to complementary strands of DNA localized to the surface of the microarray

  • Redundancy is computed by the mutual information (MI) between pairs of features whereas relevance is measured by the MI between each feature and the class labels

  • We addressed RotBoost ensemble classification method to cope with gene microarray classification problems

Read more

Summary

Introduction

Previous studies have shown that gene microarray data analysis is a powerful and revolutionary tool for biological and medical researches by allowing the simultaneous monitoring of the expression levels of tens of thousands of genes [1]. This is done by measuring the signal intensity of fluorescing molecules attached to DNA species that are bound to complementary strands of DNA localized to the surface of the microarray. Application of microarrays to the study of human disease conditions rapidly revealed their potential as a medical diagnostic tool [3, 4] This is a class prediction problem to which supervised learning techniques are ideally suited

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call