A novel noise handling method to improve clustering of gene expression patterns

Anindya Bhattacharya, Rajat K De

doi:10.1186/1471-2105-12-s7-a3

Abstract

Background: Cluster analysis of gene expression data is a useful tool for identifying biologically relevant groups of genes that show similar expression patterns under multiple experimental conditions. The performance of a clustering algorithm depends largely on the selected similarity measure, and how well that measure handles outliers is a major contributor to its success. In gene expression data, there may be pairs of genes that have completely different expression values over a few samples under certain experimental condition(s), although they exhibit similar behavior over the other samples. Depending on the algorithm, these outliers are placed in single-element clusters (hierarchical clustering), assigned to whichever cluster they resemble most (partitioning clustering), or discarded from grouping altogether (density-based, grid-based and graph-based clustering). In all these cases, outliers affect the clustering result. Measurement errors or conditional changes during microarray experiments may cause one or more samples to differ greatly in expression level from the other samples. The expression values of a single outlier sample, or of very few, may cause a gene to be treated as an outlier. We formulate a new weighted function based method to reduce the effect of outliers on similarity measures. The better the similarity measure is in measuring similarity between genes in the presence of outliers, the better the performance of the clustering algorithm will be in forming biologically relevant groups of genes.

* Correspondence: anindyamail@rediffmail.com
1 Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, TN, 38163, USA

Highlights

Cluster analysis of gene expression data is a useful tool for identifying biologically relevant groups of genes that show similar expression patterns under multiple experimental conditions.
We formulate a new weighted function based method to reduce the effect of outliers on similarity measures.
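
The abstract does not state the weighting function itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a weighted Pearson correlation as the similarity measure and a hypothetical weighting scheme in which each sample is down-weighted according to how far its between-gene difference lies from the median difference, measured in units of the median absolute deviation (MAD). The function names `weighted_pearson` and `outlier_weights` and the tuning constant `c` are assumptions introduced here for illustration.

```python
import math
from statistics import median


def weighted_pearson(x, y, w=None):
    """Weighted Pearson correlation of two expression profiles.

    With w=None all samples get equal weight, which gives the
    ordinary Pearson coefficient.
    """
    if w is None:
        w = [1.0] * len(x)
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)


def outlier_weights(x, y, c=3.0):
    """Hypothetical per-sample weights (an assumption, not from the paper).

    Samples whose difference x_i - y_i lies far from the median
    difference, relative to c * MAD, receive weights close to zero.
    """
    diffs = [xi - yi for xi, yi in zip(x, y)]
    med = median(diffs)
    mad = median([abs(d - med) for d in diffs]) or 1e-12  # guard against MAD == 0
    return [1.0 / (1.0 + (abs(d - med) / (c * mad)) ** 2) for d in diffs]


# Two made-up profiles that agree on seven samples and disagree on the last one.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
gene_b = [1.1, 2.0, 3.2, 3.9, 5.1, 6.0, 7.2, -20.0]

plain = weighted_pearson(gene_a, gene_b)
robust = weighted_pearson(gene_a, gene_b, outlier_weights(gene_a, gene_b))
print(f"unweighted: {plain:.3f}  weighted: {robust:.3f}")
```

On this toy input, the single discordant sample is enough to make the unweighted correlation negative, while the weighted version stays close to 1. That is the behavior the abstract argues a similarity measure needs before it is passed to a clustering algorithm.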

Journal: BMC Bioinformatics	Publication Date: Aug 5, 2011
Citations: 5	License type: CC BY 2.0
