Clustering analysis of microarray gene expression data by splitting algorithm

Ruye Wang, Lucas Scharenbroich, Christopher Hart, Barbara Wold, Eric Mjolsness

doi:10.1016/s0743-7315(03)00085-6

Abstract

A clustering method based on recursive bisection is introduced for analyzing microarray gene expression data. Either dimension of a given microarray dataset, the genes or the samples, or both, can be classified in an unsupervised fashion. Alternatively, if prior knowledge of the genes or samples is available, a supervised version of the clustering analysis can be carried out. Either approach may be used to generate a partial or complete binary hierarchy, the dendrogram, showing the underlying structure of the dataset. Compared with existing clustering methods used for microarray data analysis (such as hierarchical and K-means clustering), the method presented here offers much improved computational efficiency while retaining effective separation of data clusters under a distance metric, a straightforward parallel implementation, and useful extraction and presentation of biological information. Clustering results for both synthesized and experimental microarray data are presented to demonstrate the performance of the algorithm.
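
To make the idea of recursive bisection concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not state the splitting rule, so the sketch assumes a plain 2-means split under squared Euclidean distance. The names `bisect`, `build_tree`, `leaves`, `min_size`, and `max_depth` are hypothetical and chosen only for illustration. Stopping early with `max_depth` yields a partial hierarchy, and letting the recursion run yields a complete one.

```python
def dist2(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(data, idx):
    """Component-wise mean of the rows of `data` selected by `idx`."""
    return [sum(col) / len(idx) for col in zip(*(data[i] for i in idx))]


def bisect(data, idx, iters=20):
    """Split `idx` in two with a 2-means step (an assumed splitting rule)."""
    # Deterministic seeds: the first row and the row farthest from it.
    c0 = data[idx[0]]
    c1 = data[max(idx, key=lambda i: dist2(data[i], c0))]
    left, right = list(idx), []
    for _ in range(iters):
        new_left = [i for i in idx if dist2(data[i], c0) <= dist2(data[i], c1)]
        new_right = [i for i in idx if dist2(data[i], c0) > dist2(data[i], c1)]
        if not new_left or not new_right:
            break  # degenerate split: keep the previous assignment
        converged = new_left == left
        left, right = new_left, new_right
        if converged:
            break
        c0, c1 = centroid(data, left), centroid(data, right)
    return left, right


def build_tree(data, idx=None, min_size=2, max_depth=None, depth=0):
    """Recursively bisect rows into a binary hierarchy of nested dicts."""
    if idx is None:
        idx = list(range(len(data)))
    node = {"members": idx, "children": []}
    # Stop when the node is too small to split or the depth limit is reached.
    if len(idx) < 2 * min_size or (max_depth is not None and depth >= max_depth):
        return node
    left, right = bisect(data, idx)
    if len(left) < min_size or len(right) < min_size:
        return node
    node["children"] = [
        build_tree(data, left, min_size, max_depth, depth + 1),
        build_tree(data, right, min_size, max_depth, depth + 1),
    ]
    return node


def leaves(node):
    """Return the member lists of all leaf clusters, left to right."""
    if not node["children"]:
        return [node["members"]]
    return [m for child in node["children"] for m in leaves(child)]
```

Calling `leaves(build_tree(data))` on a list of numeric rows returns the row indices grouped by leaf cluster. Passing the transposed matrix clusters the other dimension (samples instead of genes). Because the two children of a node are processed independently, the recursive calls could in principle be dispatched to separate workers.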

Full Text