On the use of resampling tests for evaluating statistical significance of binding-site co-occurrence

David S Huen, Steven Russell

doi:10.1186/1471-2105-11-359

Abstract

Background: In eukaryotes, most DNA-binding proteins exert their action as members of large effector complexes. The presence of these complexes is revealed in high-throughput genome-wide assays by the co-occurrence of the binding sites of different complex components. Resampling tests are one route by which the statistical significance of apparent co-occurrence can be assessed.

Results: We have investigated two resampling approaches for evaluating the statistical significance of binding-site co-occurrence. The permutation test approach was found to yield overly favourable p-values, while the independent resampling approach had the opposite effect and is of little use in practical terms. We have developed a new, pragmatically devised hybrid approach that, when applied to the experimental results of a Polycomb/Trithorax study, yielded p-values consistent with the findings of that study. We extended our investigations to the FL method developed by Haiminen et al., which derives its null distribution from all binding sites within a dataset, and show that the p-value computed for a pair of factors by this method can depend on which other factors are included in that dataset. Both our hybrid method and the FL method appeared to yield plausible estimates of the statistical significance of co-occurrences, although our hybrid method was more conservative when applied to the Polycomb/Trithorax dataset. A high-performance parallelized implementation of the hybrid method is available.

Conclusions: We propose a new resampling-based co-occurrence significance test and demonstrate that it performs as well as or better than existing methods on a large experimentally derived dataset. We believe it can be usefully applied to data from high-throughput genome-wide techniques such as ChIP-chip or DamID. The Cooccur package, which implements our approach, accompanies this paper.

Highlights

In eukaryotes, most DNA-binding proteins exert their action as members of large effector complexes
The classical statistical methods rely on obtaining the null distribution for a statistic that correlates with the phenomenon of interest
We were interested to determine whether their framework circumvents the drawbacks we identified with a permutation test approach assessed with the hypergeometric distribution
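
To make the hypergeometric assessment mentioned above concrete, the sketch below computes an upper-tail hypergeometric p-value for the overlap between two sets of bound regions. It is an illustration only, not code from the paper or from the Cooccur package. It assumes the genome has been divided into `n_bins` equally likely bins and that each factor occupies a set of bins; the function name and all parameter names are hypothetical.

```python
from math import comb


def hypergeom_tail(n_bins, n_a, n_b, n_overlap):
    """Upper-tail hypergeometric p-value for binding-site overlap.

    Assumption (not from the paper): the genome is split into n_bins
    equally likely bins, factor A occupies n_a of them, factor B
    occupies n_b of them, and n_overlap bins are occupied by both.
    Returns P(X >= n_overlap) when B's bins are drawn uniformly
    without replacement.
    """
    largest_possible = min(n_a, n_b)
    total = comb(n_bins, n_b)
    favourable = sum(
        comb(n_a, k) * comb(n_bins - n_a, n_b - k)
        for k in range(n_overlap, largest_possible + 1)
    )
    return favourable / total


# Hypothetical numbers: 1000 bins, 50 sites for A, 40 for B, 10 shared.
p = hypergeom_tail(1000, 50, 40, 10)
print(f"p-value for an overlap of at least 10 bins: {p:.3g}")
```

The uniform-bin null used here is the simplest possible choice. The abstract reports that the permutation test approach yielded overly favourable p-values, so a small value from a calculation of this kind should be read with caution.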

Summary

Introduction

Most DNA-binding proteins exert their action as members of large effector complexes. The presence of these complexes is revealed in high-throughput genome-wide assays by the co-occurrence of the binding sites of different complex components. A large number of proteins are known to bind DNA in a location-specific manner. These include transcription factors, replication factors and chromatin components. When the binding sites of the individual proteins within a complex are determined by genome-wide high-throughput assays, these complexes are revealed as regions where the binding sites of multiple proteins are clustered. Many methods have been proposed for assessing the statistical significance of such clusters (reviewed in [1]). We will discuss how we have addressed these questions when considering the merit of a particular test and when devising an improved test.
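
The general idea of a resampling test for co-occurrence can be sketched as follows. This is a deliberately naive Monte Carlo version, not the hybrid method proposed in the paper and not the FL method: it holds one factor's sites fixed, repeatedly redraws the other factor's sites uniformly from a binned genome, and counts how often the resampled overlap is at least as large as the observed one. The bin representation, the function names and the default number of resamples are all assumptions made for illustration.

```python
import random


def cooccurrence_count(sites_a, sites_b):
    """Number of bins occupied by both factors."""
    return len(set(sites_a) & set(sites_b))


def resampling_p_value(n_bins, sites_a, sites_b, n_resamples=2000, seed=0):
    """Monte Carlo p-value for the observed co-occurrence.

    Assumptions (illustrative, not the paper's method): binding sites
    are bin indices in range(n_bins), and under the null hypothesis
    factor B's sites fall uniformly at random, without replacement.
    """
    rng = random.Random(seed)
    fixed = set(sites_a)
    n_b = len(set(sites_b))
    observed = cooccurrence_count(sites_a, sites_b)

    at_least_as_extreme = 0
    for _ in range(n_resamples):
        resampled = rng.sample(range(n_bins), n_b)
        overlap = sum(1 for site in resampled if site in fixed)
        if overlap >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Hypothetical data: two factors sharing most of their 20 sites.
factor_a = list(range(0, 200, 10))
factor_b = factor_a[:15] + [3, 7, 11, 13, 17]
print(resampling_p_value(1000, factor_a, factor_b))
```

Real binding sites are not uniformly distributed along the genome, which is one reason a null model this simple can overstate significance. The choice of null distribution is the central design decision in any such test.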

Journal: BMC Bioinformatics	Publication Date: Jun 30, 2010
Citations: 25	License type: CC BY 2.0
