Compound Poisson Approximation of the Number of Occurrences of a Position Frequency Matrix (PFM) on Both Strands

Utz J Pape, Sven Rahmann, Fengzhu Sun, Martin Vingron

doi:10.1089/cmb.2007.0084

Abstract

Transcription factors play a key role in gene regulation by interacting with specific binding sites or motifs. Enrichment of binding motifs is therefore important for genome annotation, and efficient computation of the statistical significance (the p-value) of this enrichment is crucial. We propose an efficient approximation to compute this significance. By incorporating both strands of the DNA molecule and explicitly modeling dependencies between overlapping hits, we achieve accurate results for any DNA motif based on its Position Frequency Matrix (PFM) representation. The accuracy of the p-value approximation is shown by comparison with the simulated count distribution. Furthermore, we compare the approach with a binomial approximation, a (compound) Poisson approximation, and a normal approximation. In general, our approach outperforms these approximations or is equally good but significantly faster. An implementation of our approach is available at http://mosta.molgen.mpg.de.
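To make the idea of a compound Poisson p-value concrete, the following is a minimal sketch and not the authors' implementation. It assumes that motif hits arrive in clumps, that the number of clumps is Poisson with rate `lam`, and that the clump size distribution `clump_pmf` is already known; both names are hypothetical inputs chosen for illustration. How the paper derives such quantities from the PFM, the score threshold, and the background model is not described in the abstract and is not reproduced here. The count distribution is obtained with the standard Panjer recursion for compound Poisson sums.

```python
import math


def compound_poisson_pmf(lam, clump_pmf, max_count):
    """Return [P(S = 0), ..., P(S = max_count)] for a compound Poisson sum S.

    lam       : assumed Poisson rate of clumps (hypothetical input).
    clump_pmf : clump_pmf[j - 1] = P(clump size = j) for j >= 1
                (hypothetical input; clumps of size 0 are not allowed).
    """
    # With no clumps of size 0, S = 0 happens only when there are no clumps.
    g = [math.exp(-lam)] + [0.0] * max_count
    for s in range(1, max_count + 1):
        total = 0.0
        # Panjer recursion: g(s) = (lam / s) * sum_j j * f(j) * g(s - j)
        for j in range(1, min(s, len(clump_pmf)) + 1):
            total += j * clump_pmf[j - 1] * g[s - j]
        g[s] = lam / s * total
    return g


def p_value(lam, clump_pmf, observed):
    """Upper-tail probability P(S >= observed) under the compound Poisson model."""
    if observed <= 0:
        return 1.0
    below = compound_poisson_pmf(lam, clump_pmf, observed - 1)
    return max(0.0, 1.0 - sum(below))


if __name__ == "__main__":
    # Illustrative numbers only: 2 expected clumps, 80% single hits, 20% double hits.
    print(p_value(2.0, [0.8, 0.2], observed=6))
```

When every clump has size one (`clump_pmf = [1.0]`), the recursion reduces to the ordinary Poisson distribution, which gives a simple sanity check on the sketch.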

Journal: Journal of computational biology : a journal of computational molecular cell biology
Publication Date: Jul 1, 2008
Citations: 34

Similar Papers

Natural similarity measures between position frequency matrices with an application to clustering
Utz J Pape ... Martin Vingron
Computer applications in the biosciences : CABIOS | VOL. 24
02 Jan 2008

Transcription Factors and DNA Play Hide and Seek.
David M Suter
Trends in cell biology | VOL. 30
07 Apr 2020

DNA Motif Match Statistics Without Poisson Approximation.
Wolfgang Kopp ... Martin Vingron
Journal of computational biology : a journal of computational molecular cell biology | VOL. 26
17 Apr 2019

An Estimation of Distribution Algorithm for Motif Discovery
Gang Li ... Kin-Hong Lee
01 Jun 2008
