Gene selection and classification from microarray data using kernel machine

Ji-Hoon Cho, Dongkwon Lee, Jin Hyun Park, In-Beum Lee

doi:10.1016/j.febslet.2004.05.087

Abstract

The discrimination of cancer patients (including subtypes) based on gene expression data is a critical problem with clinical ramifications. Central to solving this problem is the issue of how to extract the most relevant genes from the several thousand genes on a typical microarray. Here, we propose a methodology that effectively selects an informative subset of genes and classifies disease subtypes (or patients) using the selected genes. We employ a kernel machine, kernel Fisher discriminant analysis (KFDA), for discrimination and use the derivatives of the kernel function to perform gene selection. Using a modified form of KFDA in the minimum squared error (MSE) sense and the gradients of the kernel functions, we construct an effective gene selection criterion. We assess the performance of the proposed methodology by applying it to three gene expression datasets: leukemia, breast cancer, and colon cancer. Using a few informative genes, the proposed method accurately and reliably classified cancer subtypes (or patients). Also, through a comparison study, we verify the reliability of the gene selection and discrimination results.
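The abstract does not give the exact form of the selection criterion, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes an RBF kernel, class targets of +1 and -1, a ridge-regularized least-squares fit with a bias term standing in for the MSE form of KFDA, and a gene score defined as the mean squared gradient of the discriminant over the training samples. The function names, the parameters `gamma` and `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """k(a, b) = exp(-gamma * ||a - b||^2) for every row pair of A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_mse(X, y, gamma, lam):
    """Least-squares kernel discriminant f(x) = sum_i alpha_i k(x, x_i) + b.

    Assumption: solves the bordered linear system
        [0   1^T      ] [b    ]   [0]
        [1   K + lam*I] [alpha] = [y]
    which is one common MSE-style formulation, not necessarily the paper's.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + lam * np.eye(n)
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, coefficients alpha

def discriminant_gradients(X, alpha, gamma):
    """grads[j, g] = d f / d x_g evaluated at training sample x_j.

    For the RBF kernel, d k(x, x_i) / d x_g = -2 * gamma * (x_g - x_ig) * k(x, x_i).
    """
    W = rbf_kernel(X, X, gamma) * alpha[None, :]  # W[j, i] = alpha_i * k(x_j, x_i)
    return -2.0 * gamma * (W.sum(1)[:, None] * X - W @ X)

# Synthetic stand-in for an expression matrix: 40 samples, 20 "genes",
# with gene index 3 shifted by class label (illustrative data only).
rng = np.random.default_rng(0)
y = np.repeat([1.0, -1.0], 20)
X = rng.normal(size=(40, 20))
X[:, 3] += 3.0 * y

gamma, lam = 0.05, 1.0  # hypothetical hyperparameters
b, alpha = fit_kernel_mse(X, y, gamma, lam)
grads = discriminant_gradients(X, alpha, gamma)
scores = (grads**2).mean(axis=0)     # one sensitivity score per gene
ranking = np.argsort(scores)[::-1]   # genes ordered from most to least sensitive
```

Genes whose perturbation changes the discriminant the most receive the highest scores, so a subset could be chosen by keeping the top entries of `ranking` and refitting the classifier on those genes alone.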

Full Text