A Novel Bayesian DNA Motif Comparison Method for Clustering and Retrieval

Naomi Habib, Hanah Margalit, Nir Friedman, Tommy Kaplan

doi:10.1371/journal.pcbi.1000010


Open Access

https://doi.org/10.1371/journal.pcbi.1000010


Journal: PLoS Computational Biology
Publication Date: Feb 29, 2008
Citations: 95
License type: CC BY 4.0

Affiliation: Hebrew University of Jerusalem

Abstract

Characterizing the DNA-binding specificities of transcription factors is a key problem in computational biology that has been addressed by multiple algorithms. These usually take as input sequences that are putatively bound by the same factor and output one or more DNA motifs. A common practice is to apply several such algorithms simultaneously to improve coverage at the price of redundancy. In interpreting such results, two tasks are crucial: clustering of redundant motifs, and attributing the motifs to transcription factors by retrieval of similar motifs from previously characterized motif libraries. Both tasks inherently involve motif comparison. Here we present a novel method for comparing and merging motifs, based on Bayesian probabilistic principles. This method takes into account both the similarity in positional nucleotide distributions of the two motifs and their dissimilarity to the background distribution. We demonstrate the use of the new comparison method as a basis for motif clustering and retrieval procedures, and compare it to several commonly used alternatives. Our results show that the new method outperforms other available methods in accuracy and sensitivity. We incorporated the resulting motif clustering and retrieval procedures in a large-scale automated pipeline for analyzing DNA motifs. This pipeline integrates the results of various DNA motif discovery algorithms and automatically merges redundant motifs from multiple training sets into a coherent annotated library of motifs. Application of this pipeline to recent genome-wide transcription factor location data in S. cerevisiae successfully identified DNA motifs in a manner that is as good as semi-automated analysis reported in the literature. Moreover, we show how this analysis elucidates the mechanisms of condition-specific preferences of transcription factors.

Highlights

Transcription initiation is modulated by transcription factors that recognize sequence-specific binding sites in regulatory regions.
It is crucial to combine similar motifs and to relate them to transcription factors. To this end we developed an accurate, fully automated method, termed Bayesian Likelihood 2-Component (BLiC), based upon an improved similarity measure for comparing DNA motifs.
By applying it to genome-wide data in yeast, we identified the DNA motifs of transcription factors and their putative target genes.

Summary

Introduction

Transcription initiation is modulated by transcription factors that recognize sequence-specific binding sites in regulatory regions. The organization of binding sites around a gene specifies which factors can bind to it and where, and determines to what extent the gene is transcribed under different conditions. To understand this regulatory mechanism, one must specify the DNA binding preferences of transcription factors. In large-scale experiments, where the motif output set is very large, the tasks of scoring, merging and identifying motifs need to be automated. To address both the clustering and the retrieval challenges, we need an accurate and sensitive method for comparing DNA motifs.
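The abstract describes a comparison score that rewards two motifs for having similar positional nucleotide distributions and for differing from the background distribution. The following is a minimal sketch of that two-component idea, not the published BLiC formula: it assumes motifs are given as per-position count vectors in A, C, G, T order, that the two motifs are already aligned and of equal length, and that distributions are estimated as posterior means under a symmetric Dirichlet prior. The function names, the pseudocount of 1 and the uniform background are all illustrative assumptions.

```python
import math


def posterior_mean(counts, pseudocount=1.0):
    """Smoothed nucleotide distribution (symmetric Dirichlet prior; an assumption)."""
    total = sum(counts) + pseudocount * len(counts)
    return [(c + pseudocount) / total for c in counts]


def log_likelihood(counts, probs):
    """Multinomial log-likelihood of counts under probs (constant term omitted)."""
    return sum(c * math.log(p) for c, p in zip(counts, probs))


def column_score(counts1, counts2, background=(0.25, 0.25, 0.25, 0.25),
                 pseudocount=1.0):
    """Illustrative two-component score for one pair of aligned columns.

    Component 1: common source versus two independent sources (similarity).
    Component 2: common source versus the background (specificity).
    """
    merged = [a + b for a, b in zip(counts1, counts2)]
    ll_common = log_likelihood(merged, posterior_mean(merged, pseudocount))
    ll_independent = (
        log_likelihood(counts1, posterior_mean(counts1, pseudocount))
        + log_likelihood(counts2, posterior_mean(counts2, pseudocount))
    )
    ll_background = log_likelihood(merged, background)
    similarity = ll_common - ll_independent
    specificity = ll_common - ll_background
    return similarity + specificity


def motif_score(motif1, motif2, **kwargs):
    """Sum column scores over two pre-aligned motifs of equal length."""
    if len(motif1) != len(motif2):
        raise ValueError("this sketch assumes pre-aligned motifs of equal length")
    return sum(column_score(a, b, **kwargs) for a, b in zip(motif1, motif2))


# Hypothetical count matrices: rows are positions, columns are A, C, G, T.
motif_a = [[18, 1, 1, 0], [0, 19, 1, 0]]
motif_b = [[17, 2, 0, 1], [1, 18, 0, 1]]   # close to motif_a
motif_c = [[0, 1, 1, 18], [18, 1, 1, 0]]   # different from motif_a
uninformative = [[5, 5, 5, 5]]             # matches the uniform background
```

Under these assumptions, `motif_score(motif_a, motif_b)` is higher than `motif_score(motif_a, motif_c)`, and two columns that both match the background score close to zero: they are similar to each other but carry no specificity. A practical comparison would also have to search over alignments, including offsets and reverse complements, which this sketch leaves out.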

