Abstract

Background: Correctly identifying genomic regions enriched with histone modifications and transcription factors is key to understanding their regulatory and developmental roles. Conceptually, these regions are divided into two categories, narrow peaks and broad domains, and different algorithms are used to identify each one. Datasets that span these two categories are often analyzed with a single program for peak calling combined with an ad hoc method for domains.

Results: We developed hiddenDomains, which identifies both peaks and domains, and compared it to the leading algorithms using H3K27me3, H3K36me3, GABP, ESR1 and FOXA ChIP-seq datasets. The output from the programs was compared to qPCR-validated enriched and depleted sites, predicted transcription factor binding sites, and highly transcribed gene bodies. With every method of comparison, hiddenDomains performed as well as, if not better than, algorithms dedicated to a specific type of analysis.

Conclusions: hiddenDomains performs as well as the best domain and peak calling algorithms, making it ideal for analyzing ChIP-seq datasets, especially those that contain a mixture of peaks and domains.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-0991-z) contains supplementary material, which is available to authorized users.

Highlights

  • Identifying genomic regions enriched with histone modifications and transcription factors is key to understanding their regulatory and developmental roles

  • Using ChIP-seq datasets for H3K27me3, GA-binding protein (GABP), Estrogen Receptor 1 (ESR1) and Forkhead Box A1 (FOXA1), we have shown that hiddenDomains’s sensitivities and specificities are among the best, matching if not exceeding those of methods that are dedicated to identifying broad domains or narrow peaks

  • We have shown that a larger percentage of hiddenDomains’s GABP, ESR1 and FOXA1 results overlap predicted binding sites than those of any other method, both with the default bin size (1 kb) and with much smaller bin sizes (212 and 200 bp)
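
The metrics named in these highlights can be illustrated with a minimal sketch. This is not hiddenDomains's code or the authors' evaluation pipeline: the function names, the toy coordinates, and the "any overlap counts as a hit" criterion are all assumptions made for illustration. Sensitivity is taken here as the fraction of validated enriched sites covered by at least one called region, specificity as the fraction of validated depleted sites covered by none, and the overlap percentage as the fraction of called regions that contain a predicted binding site.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates, as in BED files.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def hits_any(query: Interval, targets: List[Interval]) -> bool:
    """True if the query overlaps at least one target interval."""
    return any(overlaps(query, t) for t in targets)


def sensitivity(enriched_sites: List[Interval], calls: List[Interval]) -> float:
    """Fraction of validated enriched sites overlapped by a called region."""
    true_pos = sum(hits_any(s, calls) for s in enriched_sites)
    return true_pos / len(enriched_sites)


def specificity(depleted_sites: List[Interval], calls: List[Interval]) -> float:
    """Fraction of validated depleted sites NOT overlapped by any called region."""
    true_neg = sum(not hits_any(s, calls) for s in depleted_sites)
    return true_neg / len(depleted_sites)


def fraction_overlapping(calls: List[Interval], predicted: List[Interval]) -> float:
    """Fraction of called regions that overlap a predicted binding site."""
    hit = sum(hits_any(c, predicted) for c in calls)
    return hit / len(calls)


# Hypothetical toy data (not taken from the paper).
calls = [("chr1", 1000, 2000), ("chr1", 5000, 6000), ("chr2", 0, 1000)]
enriched = [("chr1", 1500, 1600), ("chr1", 5900, 6100),
            ("chr2", 3000, 3100), ("chr1", 100, 200)]
depleted = [("chr1", 3000, 3100), ("chr2", 500, 600)]
predicted = [("chr1", 1200, 1210), ("chr2", 900, 910)]

sens = sensitivity(enriched, calls)
spec = specificity(depleted, calls)
frac = fraction_overlapping(calls, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} overlap={frac:.2f}")
```

The linear scan over intervals is adequate for a handful of validated sites; for genome-wide call sets, a sorted sweep or an interval tree would be the usual choice.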



Introduction

Identifying genomic regions enriched with histone modifications and transcription factors is key to understanding their regulatory and developmental roles. These regions are divided into two categories, narrow peaks and broad domains, and different algorithms are used to identify each one. Datasets that span these two categories are often analyzed with a single program for peak calling combined with an ad hoc method for domains. ChIP-seq analysis algorithms have specialized in identifying one of two types of enrichment: broad domains (e.g. histone modifications that cover entire gene bodies) or narrow peaks (e.g. a transcription factor bound to an enhancer). A program that accurately identifies both broad domains and narrow peaks simultaneously would greatly simplify these analyses.

