Abstract
Background: Key regulators of developmental processes can be prioritized through integrated analysis of ChIP-Seq data of master transcription factors (TFs) such as Nanog and Oct4, active histone modifications (HMs) such as H3K4me3 and H3K27ac, and repressive HMs such as H3K27me3. Recent studies show that broad enrichment signals, such as super-enhancers and broad H3K4me3 enrichment signals, play more dominant roles in epigenetic regulatory mechanisms than the short enrichment signals of the master TFs and H3K4me3. Besides the broad enrichment signals, up to tens of thousands of short enrichment signals of these TFs and HMs exist in the genome. Prioritization of the broad enrichment signals from ChIP-Seq data is therefore a prerequisite for such integrated analysis.
Results: Here, we present a method named Clustering-Local-Unique-Enriched-Signals (CLUES), which uses an adaptive-size-windows strategy to identify enriched regions (ERs) and cluster them into broad enrichment signals. Tested on 62 ENCODE ChIP-Seq datasets of Ctcf and Nrsf, CLUES performs as well as MACS2 at prioritizing ERs that contain the TF's motif. Tested on 165 ENCODE ChIP-Seq datasets of H3K4me3, H3K27me3, and H3K36me3, CLUES performs better than existing algorithms at prioritizing broad enrichment signals that implicate cell functions influenced by epigenetic regulatory mechanisms. Most importantly, CLUES helps to confirm several novel regulators of mouse ES cell self-renewal and pluripotency through integrated analysis of prioritized broad enrichment signals of H3K4me3, H3K27me3, Nanog and Oct4, with the support of a CRISPR/Cas9 negative selection genetic screen.
Conclusions: CLUES holds promise for prioritizing broad enrichment signals from ChIP-Seq data. CLUES can be downloaded from https://github.com/Wuchao1984/CLUESv1.
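The idea of clustering ERs into broad enrichment signals can be illustrated with a minimal sketch. This is not the CLUES algorithm, whose adaptive-size-windows strategy is not detailed in the abstract; it assumes a much simpler rule in which ERs on the same chromosome are merged when the gap between them is at most a fixed distance. The function name `cluster_enriched_regions`, the parameter `max_gap`, and the example coordinates are all hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end); coordinates are illustrative only
Region = Tuple[str, int, int]


def cluster_enriched_regions(regions: List[Region], max_gap: int = 1000) -> List[Region]:
    """Merge ERs on the same chromosome whose gap is at most max_gap bp.

    Assumption: a fixed gap threshold stands in for whatever clustering
    criterion CLUES actually applies.
    """
    clusters: List[Region] = []
    # Sorting by (chromosome, start, end) puts neighbouring ERs next to each other.
    for chrom, start, end in sorted(regions):
        if clusters:
            last_chrom, last_start, last_end = clusters[-1]
            if chrom == last_chrom and start - last_end <= max_gap:
                # Extend the current cluster to cover this ER.
                clusters[-1] = (last_chrom, last_start, max(last_end, end))
                continue
        # Otherwise start a new cluster.
        clusters.append((chrom, start, end))
    return clusters


# Hypothetical ERs: the first two on chr1 are 500 bp apart and get merged.
ers = [("chr1", 800, 1200), ("chr1", 100, 300), ("chr1", 5000, 5400), ("chr2", 100, 200)]
broad = cluster_enriched_regions(ers, max_gap=1000)
```

With these inputs, the two nearby chr1 regions collapse into one broad region while the distant chr1 region and the chr2 region stay separate. A fixed `max_gap` is the simplest possible choice; an adaptive criterion would vary the threshold with local signal.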
Highlights
Mapping epigenomic modifications and chromatin regulator/transcription factor binding positions is critical for understanding human development and disease manifestation [1, 2]
Tested on 165 ENCODE ChIP-Seq datasets of H3K4me3, H3K27me3, and H3K36me3, CLUES performs better than existing algorithms at prioritizing broad enrichment signals that implicate cell functions influenced by epigenetic regulatory mechanisms
We found that CLUES detects more reliable enriched regions (ERs) than MACS2, especially on low-enrichment datasets (16 of 20 pairs), at the default q-value threshold (0.05 for both CLUES and MACS2; Fig 2C and 2D and S3 Fig; see “Comparing the number of reliable-ERs identified by CLUES and MACS2 under different signal-to-noise (SNR) value” in Methods for more details)
Summary
Mapping epigenomic modifications and chromatin regulator/transcription factor binding positions is critical for understanding human development and disease manifestation [1, 2]. The discovery of broad enrichment signals, such as broad H3K4me3 enrichment signals [17, 18], super-enhancer elements [1] and bivalent chromatin domains [19], motivates the integration of multiple ChIP-Seq datasets to explore epigenetic regulatory mechanisms and identify potential key developmental regulators. Key regulators of developmental processes can be prioritized through integrated analysis of ChIP-Seq data of master transcription factors (TFs) such as Nanog and Oct4, active histone modifications (HMs) such as H3K4me3 and H3K27ac, and repressive HMs such as H3K27me3. Besides the broad enrichment signals, up to tens of thousands of short enrichment signals of these TFs and HMs exist in the genome. Prioritization of the broad enrichment signals from ChIP-Seq data is a prerequisite for such integrated analysis.