A Bayesian Search for Transcriptional Motifs

Andrew K Miller, Poul M F Nielsen, Edmund J Crampin, Cristin G Print

doi:10.1371/journal.pone.0013897

Open Access

https://doi.org/10.1371/journal.pone.0013897

Journal: PLoS ONE
Publication Date: Nov 18, 2010
Citations: 15
License type: CC BY 4.0

Affiliation: University of Auckland

Abstract

Identifying transcription factor (TF) binding sites (TFBSs) is an important step towards understanding transcriptional regulation. A common approach is to use gaplessly aligned, experimentally supported TFBSs for a particular TF, and algorithmically search for more occurrences of the same TFBSs. The largest publicly available databases of TF binding specificities contain models which are represented as position weight matrices (PWM). There are other methods using more sophisticated representations, but these have more limited databases, or aren't publicly available. Therefore, this paper focuses on methods that search using one PWM per TF. An algorithm, MATCH™, for identifying TFBSs corresponding to a particular PWM is available, but is not based on a rigorous statistical model of TF binding, making it difficult to interpret or adjust the parameters and output of the algorithm. Furthermore, there is no public description of the algorithm sufficient to exactly reproduce it. Another algorithm, MAST, computes a p-value for the presence of a TFBS using true probabilities of finding each base at each offset from that position.

We developed a statistical model, BaSeTraM, for the binding of TFs to TFBSs, taking into account random variation in the base present at each position within a TFBS. Treating the counts in the matrices and the sequences of sites as random variables, we combine this TFBS composition model with a background model to obtain a Bayesian classifier. We implemented our classifier in a package (SBaSeTraM). We tested SBaSeTraM against a MATCH™ implementation by searching all probes used in an experimental Saccharomyces cerevisiae TF binding dataset, and comparing our predictions to the data.

We found no statistically significant differences in sensitivity between the algorithms (at fixed selectivity), indicating that SBaSeTraM's performance is at least comparable to the leading currently available algorithm.
Our software is freely available at: http://wiki.github.com/A1kmm/sbasetram/building-the-tools.
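
The idea of combining a TFBS composition model with a background model to obtain a Bayesian classifier can be illustrated with a minimal sketch. This is not the BaSeTraM model or the SBaSeTraM code: the function names, the symmetric pseudocount, the uniform background, the prior probability of a site, and the toy count matrix are all assumptions chosen for illustration. Each window of the sequence is scored by its posterior probability of being a site, given per-position base counts and a background base distribution.

```python
import math

BASES = "ACGT"

def site_log_likelihood(counts, window, pseudocount=1.0):
    """log P(window | site), using smoothed per-position base frequencies.

    `counts` is a list of dicts (one per motif position) mapping each base
    to the number of aligned known sites with that base. The pseudocount is
    an illustrative assumption, not a value taken from the paper.
    """
    total = 0.0
    for column, base in zip(counts, window):
        n = sum(column[b] for b in BASES)
        p = (column[base] + pseudocount) / (n + len(BASES) * pseudocount)
        total += math.log(p)
    return total

def background_log_likelihood(window, background):
    """log P(window | background) under independent, identically distributed bases."""
    return sum(math.log(background[b]) for b in window)

def posterior_site_probability(counts, window, background, prior):
    """P(site | window) by Bayes' rule over the two hypotheses (site, background)."""
    log_site = site_log_likelihood(counts, window) + math.log(prior)
    log_bg = background_log_likelihood(window, background) + math.log(1.0 - prior)
    diff = log_bg - log_site
    if diff > 700.0:  # avoid overflow in exp(); posterior is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))

def scan(counts, sequence, background, prior, threshold=0.5):
    """Return (offset, window, posterior) for windows at or above the threshold."""
    width = len(counts)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        p = posterior_site_probability(counts, window, background, prior)
        if p >= threshold:
            hits.append((i, window, p))
    return hits

# Hypothetical toy matrix: ten aligned sites, all reading TGAC.
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
uniform_background = {b: 0.25 for b in BASES}

hits = scan(toy_counts, "AATGACGG", uniform_background, prior=0.05)
for offset, window, p in hits:
    print(offset, window, round(p, 3))
```

In this sketch the threshold acts on a posterior probability, so changing the prior or the threshold has a direct probabilistic interpretation, which is the kind of interpretability the abstract argues a statistical model provides.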

Summary

Introduction

Identifying which transcription factors bind to which promoters is an important step towards understanding the transcriptional regulatory code. Methods for finding transcription factor binding sites (as motifs that are statistically overrepresented in sequences) can be broadly classified as those based on phylogenetic footprinting and those that are not. These methods have been widely compared [1,2] and reviewed [3]. Polypeptide coding sequences are considered background. The distribution of bases in these sequences is determined by the effect of the polypeptide sequence on evolutionary fitness, which would require more knowledge about biological function than is available and is too complex to include in the background model.
