Abstract

We present MUSIC, a signal processing approach for identification of enriched regions in ChIP-Seq data, available at http://www.music.gersteinlab.org. MUSIC first filters the ChIP-Seq read-depth signal to remove systematic noise caused by non-uniform mappability, which would otherwise fragment enriched regions. It then performs a multiscale decomposition using median filtering to identify enriched regions at multiple length scales. This is useful given the wide range of scales probed in ChIP-Seq assays. MUSIC performs favorably in terms of accuracy and reproducibility compared with other methods. In particular, analysis of RNA polymerase II data reveals a clear distinction between the stalled and elongating forms of the polymerase.
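To make the idea of a median-filter-based multiscale decomposition concrete, here is a minimal sketch on a toy read-depth profile. It is not the MUSIC implementation: the function names, the truncated-edge handling, the fixed threshold, and the window lengths are all assumptions chosen for illustration.

```python
from statistics import median

def median_filter(signal, window):
    """Sliding-window median over `window` bins (odd); windows are truncated at the edges."""
    half = window // 2
    n = len(signal)
    return [median(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]

def enriched_regions(smoothed, threshold):
    """Return half-open (start, end) bin intervals where the smoothed signal exceeds threshold."""
    regions = []
    start = None
    for i, value in enumerate(smoothed):
        if value > threshold and start is None:
            start = i                      # region opens
        elif value <= threshold and start is not None:
            regions.append((start, i))     # region closes
            start = None
    if start is not None:
        regions.append((start, len(smoothed)))
    return regions

# Hypothetical toy profile: a broad domain (bins 5-14) split by a
# two-bin dip (bins 9-10), plus an isolated one-bin spike at bin 20.
read_depth = [0] * 5 + [8, 9, 8, 9, 0, 0, 9, 8, 9, 8] + [0] * 5 + [12] + [0] * 4

# Assumed scale ladder; larger windows correspond to coarser length scales.
for window in (1, 3, 5):
    smoothed = median_filter(read_depth, window)
    print(window, enriched_regions(smoothed, threshold=4))
```

In this toy case, the smallest window keeps the spike and reports the broad domain as two fragments, the intermediate window drops the spike, and the largest window bridges the dip so the domain is reported as a single region. That is the sense in which different scales expose punctate versus broad enrichment.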

Highlights

  • With the recent advancements in sequencing technologies, chromatin immunoprecipitation (ChIP)-based enrichment of DNA sequences followed by sequencing (ChIP-Seq) [1,2] has become the mainstream experimental method for genome-wide measurement of the locations of DNA binding proteins like transcription factors (TFs) and posttranslational modifications of histone proteins, or histone modifications (HMs) [3,4]

  • MUSIC algorithm: Figure 1 shows a flowchart for MUSIC

  • Mappability is an important aspect of enriched region (ER) identification from next-generation sequencing data, especially for identifying broad domains of enrichment since read depth (RD) profiles are highly correlated with the mappability map
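The relationship between read depth and mappability noted in the last highlight can be checked with a simple correlation. The sketch below uses invented per-bin values and a hand-written Pearson coefficient. The numbers and names are hypothetical and are not taken from the paper or from the MUSIC software.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-bin values: fraction of uniquely mappable positions
# and the observed read depth in the same bins.
mappability = [1.0, 1.0, 0.9, 0.2, 0.1, 0.8, 1.0, 1.0]
read_depth = [10, 11, 9, 2, 1, 8, 10, 11]

r = pearson(mappability, read_depth)
print(f"correlation between mappability and read depth: {r:.3f}")
```

A coefficient close to 1 on real data would indicate that dips in the read-depth profile largely track poorly mappable bins rather than true loss of enrichment, which is why a mappability-aware filtering step matters for broad domains.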


Summary

Introduction

With the recent advancements in sequencing technologies, chromatin immunoprecipitation (ChIP)-based enrichment of DNA sequences followed by sequencing (ChIP-Seq) [1,2] has become the mainstream experimental method for genome-wide measurement of the locations of DNA binding proteins, such as transcription factors (TFs), and of posttranslational modifications of histone proteins, or histone modifications (HMs) [3,4]. Consortium projects such as ENCODE [5] and the Roadmap Epigenomics Project [6] generated ChIP-Seq datasets to map the chromatin states of many cell lines and tissues [7]. Efficient computational methods for identifying and characterizing broad enriched regions (ERs) are necessary for understanding the regulatory effects of HMs and diffuse DNA binding proteins on gene expression, because increasing evidence indicates that these epigenetic factors are major drivers of pluripotency [10] and of disease manifestation, such as carcinogenesis [11,12,13,14,15].
