Evaluation of gene-expression clustering via mutual information distance measure.

Ido Priness,Irad Ben-Gal,Oded Maimon

doi:10.1186/1471-2105-8-111


Open Access

https://doi.org/10.1186/1471-2105-8-111


Journal: BMC Bioinformatics	Publication Date: Mar 30, 2007
Citations: 196	License type: cc-by

Affiliation: Tel Aviv University

Abstract

Background: The definition of a distance measure plays a key role in the evaluation of different clustering solutions of gene expression profiles. In this empirical study we compare different clustering solutions when using the Mutual Information (MI) measure versus the use of the well known Euclidean distance and Pearson correlation coefficient.

Results: Relying on several public gene expression datasets, we evaluate the homogeneity and separation scores of different clustering solutions. It was found that the use of the MI measure yields a more significant differentiation among erroneous clustering solutions. The proposed measure was also used to analyze the performance of several known clustering algorithms. A comparative study of these algorithms reveals that their "best solutions" are ranked almost oppositely when using different distance measures, despite the found correspondence between these measures when analysing the averaged scores of groups of solutions.

Conclusion: In view of the results, further attention should be paid to the selection of a proper distance measure for analyzing the clustering of gene expression data.

Highlights

The definition of a distance measure plays a key role in the evaluation of different clustering solutions of gene expression profiles
The results show that the sIB algorithm [32,33], which is originally based on a mutual-information criterion, obtains better Mutual Information (MI)-based homogeneity and separation scores than those provided by the K-means, the CLICK and the SOM algorithms [5,21]
In the first experiment, which is based on known clustering solutions, we show the statistical superiority of the average MI-based measure independently of the selected clustering algorithm
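
The homogeneity and separation scores named in the highlights can be illustrated with a minimal sketch. This page does not give the paper's formulas, so the pairwise definitions below are an assumption: homogeneity is taken as the mean distance between profiles in the same cluster, and separation as the mean distance between profiles in different clusters. The function name `homogeneity_and_separation` and the toy distance `abs_diff_sum` are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def homogeneity_and_separation(profiles, labels, distance):
    """Return (homogeneity, separation) under an assumed pairwise definition.

    homogeneity: mean distance over pairs of profiles sharing a cluster label.
    separation:  mean distance over pairs of profiles with different labels.
    With a distance measure, lower homogeneity and higher separation
    indicate a tighter, better separated clustering.
    """
    within, between = [], []
    for i, j in combinations(range(len(profiles)), 2):
        d = distance(profiles[i], profiles[j])
        if labels[i] == labels[j]:
            within.append(d)
        else:
            between.append(d)
    homogeneity = sum(within) / len(within) if within else float("nan")
    separation = sum(between) / len(between) if between else float("nan")
    return homogeneity, separation


def abs_diff_sum(x, y):
    """Toy distance (sum of absolute differences), for illustration only."""
    return sum(abs(a - b) for a, b in zip(x, y))


# Four toy "expression profiles" forming two obvious groups.
profiles = [[0, 0], [0, 1], [5, 5], [5, 6]]
good_labels = [0, 0, 1, 1]  # matches the obvious grouping
bad_labels = [0, 1, 0, 1]   # deliberately erroneous grouping

print(homogeneity_and_separation(profiles, good_labels, abs_diff_sum))
print(homogeneity_and_separation(profiles, bad_labels, abs_diff_sum))
```

Any distance function can be passed as `distance`, so the same scoring routine could be reused with a Euclidean, Pearson-based, or MI-based measure to compare how each one ranks a given clustering solution.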

Summary

Introduction

The definition of a distance measure plays a key role in the evaluation of different clustering solutions of gene expression profiles. In this empirical study we compare different clustering solutions when using the Mutual Information (MI) measure versus the use of the well known Euclidean distance and Pearson correlation coefficient. Clustering is a central method for analysing gene expression data and has been applied extensively in various works and applications [1,2,3,4,5]. The primary goal is to cluster together genes or tissues that manifest similar expression patterns [1]. Similar expression patterns might offer insights into various transcriptional and biological processes [6,7,8].
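
To make the comparison of the three measures concrete, here is a rough sketch of how each could be computed for a pair of expression profiles. It is not the authors' implementation. The summary does not say how MI is estimated, so the equal-frequency discretization into three bins and the normalization by the larger marginal entropy are assumptions, and all function names are hypothetical.

```python
import math
from collections import Counter


def euclidean_distance(x, y):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """One minus the Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def discretize(values, bins=3):
    """Assumed preprocessing: equal-frequency binning by rank.

    Ties are broken by position, which is a simplification.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * bins // len(values)
    return labels


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two discrete label sequences."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def mi_distance(x, y, bins=3):
    """Assumed normalization: 1 - I(X;Y) / max(H(X), H(Y))."""
    a, b = discretize(x, bins), discretize(y, bins)
    h = max(entropy(a), entropy(b))
    if h == 0:
        return 0.0
    return 1.0 - mutual_information(a, b) / h


# A symmetric, non-linear relationship between two toy profiles.
x = [1, 2, 3, 4, 5, 6]
y = [(v - 3.5) ** 2 for v in x]

print("Euclidean:", euclidean_distance(x, y))
print("Pearson  :", pearson_distance(x, y))
print("MI-based :", mi_distance(x, y))
```

In this toy case the covariance between `x` and `y` is zero, so the Pearson-based distance is about 1 even though `y` is fully determined by `x`, while the MI-based distance is noticeably smaller. The example only illustrates why the choice of measure can change how similar two profiles appear. It makes no claim about the paper's results.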



