ChIPulate: A comprehensive ChIP-seq simulation pipeline.

Vishaka Datta, Rahul Siddharthan, Sridhar Hannenhalli, Ilya Ioshikhes

doi:10.1371/journal.pcbi.1006921

Open Access

Abstract

ChIP-seq (Chromatin Immunoprecipitation followed by sequencing) is a high-throughput technique to identify genomic regions that are bound in vivo by a particular protein, e.g., a transcription factor (TF). Biological factors, such as chromatin state, indirect and cooperative binding, as well as experimental factors, such as antibody quality, cross-linking, and PCR biases, are known to affect the outcome of ChIP-seq experiments. However, the relative impact of these factors on inferences made from ChIP-seq data is not entirely clear. Here, via a detailed ChIP-seq simulation pipeline, ChIPulate, we assess the impact of various biological and experimental sources of variation on several outcomes of a ChIP-seq experiment, viz., the recoverability of the TF binding motif, accuracy of TF-DNA binding detection, the sensitivity of inferred TF-DNA binding strength, and number of replicates needed to confidently infer binding strength. We find that the TF motif can be recovered despite poor and non-uniform extraction and PCR amplification efficiencies. The recovery of the motif is, however, affected to a larger extent by the fraction of sites that are either cooperatively or indirectly bound. Importantly, our simulations reveal that the number of ChIP-seq replicates needed to accurately measure in vivo occupancy at high-affinity sites is larger than the recommended community standards. Our results establish statistical limits on the accuracy of inferences of protein-DNA binding from ChIP-seq and suggest that increasing the mean extraction efficiency, rather than amplification efficiency, would better improve sensitivity. The source code and instructions for running ChIPulate can be found at https://github.com/vishakad/chipulate.
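
To make the roles of extraction and amplification efficiency concrete, the following is a minimal sketch of how per-site fragment counts might be simulated. It is not ChIPulate's code or API. The function names, the logistic occupancy formula, the parameter values, and the use of a single uniform efficiency per step are all assumptions chosen for illustration, and the sequencing step is omitted.

```python
import math
import random


def occupancy(energy, chem_potential):
    """Assumed occupancy model: probability that a site is bound.

    Energies are in units of kT; a lower energy means stronger binding.
    """
    return 1.0 / (1.0 + math.exp(energy - chem_potential))


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_site(rng, energy, chem_potential, n_cells,
                  extraction_eff, pcr_eff, pcr_cycles):
    """Return (bound, extracted, amplified) fragment counts for one site."""
    # Each cell contributes a bound fragment with the site's occupancy.
    bound = binomial(rng, n_cells, occupancy(energy, chem_potential))
    # Each bound fragment survives extraction with probability extraction_eff.
    extracted = binomial(rng, bound, extraction_eff)
    # In each PCR cycle, every fragment is copied with probability pcr_eff.
    amplified = extracted
    for _ in range(pcr_cycles):
        amplified += binomial(rng, amplified, pcr_eff)
    return bound, extracted, amplified


def simulate_experiment(energies, chem_potential=0.0, n_cells=1000,
                        extraction_eff=0.5, pcr_eff=0.8, pcr_cycles=5,
                        seed=0):
    """Simulate all sites with one seeded random generator."""
    rng = random.Random(seed)
    return [
        simulate_site(rng, e, chem_potential, n_cells,
                      extraction_eff, pcr_eff, pcr_cycles)
        for e in energies
    ]


if __name__ == "__main__":
    # Hypothetical binding energies for a strong, a medium, and a weak site.
    for counts in simulate_experiment([-2.0, 0.0, 2.0]):
        print(counts)
```

In this sketch, extraction acts once on the bound fragments, so fragments lost at that step cannot be recovered by later amplification, whereas amplification only multiplies whatever was extracted.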

Highlights

ChIP-seq (Chromatin Immunoprecipitation and sequencing) is a popular high-throughput experimental technique to find locations that are bound in vivo by a single transcription factor (TF) [1].
Upon mapping of the DNA fragments bound by the TF to the reference genome, the genomic loci bound by the TF are identified as regions with a high density of mapped fragments, or peaks, where each peak is associated with an intensity based on the number of sequenced fragments arising from it.
Other studies have shown that the concentration of the target TF [6, 7], short-range cooperative interactions between the target TF and other TFs [8], and variation in chromatin accessibility [5, 7] explain the variation in intensities across peaks.

Summary

Introduction

ChIP-seq (Chromatin Immunoprecipitation and sequencing) is a popular high-throughput experimental technique to find locations that are bound in vivo by a single transcription factor (TF) [1]. Several studies of ChIP-seq data have focussed on the biological factors distinguishing the loci bound by the TF. Some of the variation in peak intensity can arise from indirect binding, where the target TF binds DNA indirectly via a second DNA-bound TF [9, 10, 11]. The intensity of such peaks is no longer directly dependent on the affinity of the target TF for the sequence at the bound locus.

Journal: PLoS Computational Biology
Publication Date: Mar 21, 2019
Citations: 10
License type: CC BY 4.0
