Abstract

Classification of microarray gene-expression data can potentially help medical diagnosis, and becomes an important topic in bioinformatics. However, microarray data sets are usually of small sample size relative to an overwhelming number of genes. This makes the classification problem fairly challenging. Instance-based learning (IBL) algorithms, such as nearest neighbor ( k-NN), are usually the baseline algorithm due to their simplicity. However, practices show that k-NN performs not very well in this field. This paper introduces manifold-based metric learning to improve the performance of IBL methods. A novel metric learning algorithm is proposed by utilizing both local manifold structural information and local discriminant information. In addition, a random subspace extension is also presented. We apply the proposed algorithm to the gene-classification problem in three ways: one in the original feature space, another in the reduced feature space, and the third via the random subspace extension. Statistical evaluation shows that the proposed algorithm can achieve promising results, and gain significant performance improvement over traditional IBL algorithms.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call