Abstract

Background: ChIP-seq experiments are widely used to detect and study DNA-protein interactions, such as transcription factor binding and chromatin modifications. However, downstream analysis of ChIP-seq data is currently restricted to the evaluation of signal intensity and the detection of enriched regions (peaks) in the genome. Other features of peak shape are almost always neglected, despite the remarkable differences shown by ChIP-seq for different proteins, as well as by distinct regions in a single experiment.

Results: We hypothesize that statistically significant differences in peak shape might have a functional role and a biological meaning. Thus, we design five indices able to summarize peak shapes and we employ multivariate clustering techniques to divide peaks into groups according to both their complexity and the intensity of their coverage function. In addition, our novel analysis pipeline employs a range of statistical and bioinformatics techniques to relate the obtained peak shapes to several independent genomic datasets, including other genome-wide protein-DNA maps and gene expression experiments. To clarify the meaning of peak shape, we apply our methodology to the study of the erythroid transcription factor GATA-1 in the K562 cell line and in megakaryocytes.

Conclusions: Our study demonstrates that ChIP-seq profiles include information regarding the binding of other proteins besides the one used for precipitation. In particular, peak shape provides new insights into cooperative transcriptional regulation and is correlated with gene expression.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0787-6) contains supplementary material, which is available to authorized users.
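
As a rough illustration of what summarizing a peak by shape indices could look like, the sketch below computes four simple quantities from a per-base coverage profile. The abstract does not say which five indices the authors use, so the function name `peak_shape_summary` and the indices chosen here (height, area, width at half maximum, number of local maxima) are hypothetical stand-ins, not the published method. Vectors like these are the kind of input a multivariate clustering step would take.

```python
from typing import Dict, List


def peak_shape_summary(coverage: List[int]) -> Dict[str, int]:
    """Summarize one peak's coverage profile with a few illustrative indices.

    ASSUMPTION: these indices are hypothetical examples chosen for this
    sketch; they are not the five indices defined in the paper.
    """
    if not coverage:
        raise ValueError("coverage must contain at least one position")

    height = max(coverage)   # intensity: maximum read depth in the peak
    area = sum(coverage)     # intensity: total coverage under the profile

    # Width at half maximum: positions whose coverage is >= half the height.
    half_max_width = sum(1 for c in coverage if c >= height / 2)

    # Complexity: count local maxima. Pad with zeros so maxima at the edges
    # are seen, then collapse plateaus so a flat top counts only once.
    padded = [0] + list(coverage) + [0]
    collapsed = [c for i, c in enumerate(padded) if i == 0 or c != padded[i - 1]]
    local_maxima = sum(
        1
        for i in range(1, len(collapsed) - 1)
        if collapsed[i] > collapsed[i - 1] and collapsed[i] > collapsed[i + 1]
    )

    return {
        "height": height,
        "area": area,
        "half_max_width": half_max_width,
        "local_maxima": local_maxima,
    }


# Two toy profiles: a single smooth summit and a two-summit peak.
simple_peak = [0, 2, 5, 9, 5, 2, 0]
complex_peak = [0, 4, 8, 3, 7, 8, 2, 0]
print(peak_shape_summary(simple_peak))
print(peak_shape_summary(complex_peak))
```

In this toy example the two profiles have similar heights but differ in the number of local maxima, which is the sort of difference an intensity-only analysis would miss.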

Highlights

  • ChIP-seq experiments are widely used to detect and study DNA-protein interactions, such as transcription factor binding and chromatin modifications

  • By applying the proposed analysis pipeline to ChIP-seq experiments for the transcription factor GATA-1 in K562 cells and in primary human megakaryocytes, we have demonstrated that statistically significant differences in peak shape are correlated with several cooperative transcriptional regulators

  • We have shown that GATA-1 peak shape is associated with characteristic regulatory complexes and changes in gene expression profiles

Introduction

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is widely used to detect and study DNA-protein interactions, such as transcription factor binding and chromatin modifications. This technique has been used to characterize many biological processes, enabling the creation of valuable public resources of epigenomic data (e.g. ENCODE, Roadmap Epigenomics). Due to the importance of interpreting these datasets, a large number of algorithms for downstream processing of ChIP-seq experiments have been developed [1, 2]. These methods are usually based on the evaluation of signal intensities to detect local enrichment of uniquely aligned reads on the reference genome (we refer to these enriched regions as ‘ChIP-seq peaks’). Our hypothesis is that peak shape is influenced by the organization and interactions of the proteins bound to the DNA. We want to understand if the detection of differences in peak shape in a single ChIP-seq

