Quantitative Evaluation of Established Clustering Methods for Gene Expression Data

Dörte Radke, Ulrich Möller

doi:10.1007/978-3-540-30547-7_40

Abstract

Analysis of gene expression data generated by microarray techniques often includes clustering. Although more reliable methods are available, hierarchical algorithms are still frequently employed. We clustered several data sets and quantitatively compared the performance of an agglomerative hierarchical approach using the average-linkage method with two partitioning procedures, k-means and fuzzy c-means. Investigation of the results revealed the superiority of the partitioning algorithms: the compactness of the clusters was markedly increased and the arrangement of the profiles into clusters more closely resembled biological categories. Therefore, we encourage analysts to critically scrutinize the results obtained by clustering.

Keywords (added by machine, not by the authors)

Gene Expression Data, Dissimilarity Measure, Average Dissimilarity, Cluster Validity Index, Hierarchical Algorithm
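The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the authors' procedure: the synthetic profiles, the function names (`average_linkage`, `k_means`, `compactness`), and the compactness score (mean squared Euclidean distance of each profile to its cluster centroid) are all assumptions made for illustration, and fuzzy c-means is left out for brevity. The sketch uses only the Python standard library.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of profiles."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def average_linkage(data, k):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest average pairwise Euclidean distance until k clusters remain."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(sq_dist(data[i], data[j]) ** 0.5
                            for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    labels = [0] * len(data)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def k_means(data, k, iters=100, seed=0):
    """Plain Lloyd iteration, started from k randomly chosen profiles."""
    rng = random.Random(seed)
    centers = rng.sample(data, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in data]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(data, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = centroid(members)
    return labels

def compactness(data, labels):
    """Assumed score: mean squared distance of each profile to its
    cluster centroid (lower means more compact clusters)."""
    total = 0.0
    for c in set(labels):
        members = [p for p, l in zip(data, labels) if l == c]
        mu = centroid(members)
        total += sum(sq_dist(p, mu) for p in members)
    return total / len(data)

# Synthetic stand-in for expression profiles: three groups, four conditions.
rng = random.Random(1)
group_means = [(0, 0, 0, 0), (3, 3, 0, 0), (0, 3, 3, 0)]
profiles = [tuple(rng.gauss(m, 0.5) for m in mean)
            for mean in group_means for _ in range(20)]

hier_labels = average_linkage(profiles, 3)
km_labels = k_means(profiles, 3)
print("average linkage compactness:", compactness(profiles, hier_labels))
print("k-means compactness:", compactness(profiles, km_labels))
```

On well-separated synthetic groups like these, the two methods may well agree. Differences of the kind the abstract reports would be expected mainly on noisier, real expression data. Because k-means depends on its starting centers, a fair comparison would typically repeat it over several seeds.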
