Abstract
Elucidating the transcriptional regulatory networks that underlie growth and development requires robust ways to define the complete set of transcription factor (TF) binding sites. Although TF-binding sites are known to be generally located within accessible chromatin regions (ACRs), pinpointing these DNA regulatory elements globally remains challenging. Current approaches primarily identify binding sites for a single TF (e.g. ChIP-seq), or globally detect ACRs but lack the resolution to consistently define TF-binding sites (e.g. DNase-seq, ATAC-seq). To address this challenge, we developed MNase-defined cistrome-Occupancy Analysis (MOA-seq), a high-resolution (<30 bp), high-throughput, and genome-wide strategy to globally identify putative TF-binding sites within ACRs. As a proof of concept, we applied MOA-seq to developing maize ears and defined a cistrome of 145,000 MOA footprints (MFs). While a substantial majority (76%) of the known ATAC-seq ACRs intersected with the MFs, only a minority of MFs overlapped with the ATAC peaks, indicating that the majority of MFs were novel and not detected by ATAC-seq. MFs were associated with promoters and significantly enriched for TF-binding and long-range chromatin interaction sites, including for the well-characterized FASCIATED EAR4, KNOTTED1, and TEOSINTE BRANCHED1. Importantly, the MOA-seq strategy improved the spatial resolution of TF-binding prediction and allowed us to identify 215 motif families collectively distributed over more than 100,000 non-overlapping, putatively occupied binding sites across the genome. Our study presents a simple, efficient, and high-resolution approach to identify putative TF footprints and binding motifs genome-wide, to ultimately define a native cistrome atlas.
Highlights
One of the fundamental drivers of phenotypic variation is the activation or repression of gene transcription
To define specific candidate transcription factor (TF)-binding sites within accessible chromatin regions, we developed MOA-seq to capture the putative footprints of native DNA-protein interactions
Following sequencing and read mapping to a reference genome, we plotted the density of aligned fragment midpoints to determine MOA footprints (MFs, average 29.5 bp) and used these to improve the spatial resolution of putative TF-binding event prediction (Fig 1, Step 4)
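The midpoint-density idea in the last highlight can be illustrated with a minimal sketch. It assumes fragments are given as 0-based, half-open `(start, end)` coordinates on a single chromosome; the smoothing window and the fixed count threshold are illustrative assumptions, not the peak-calling procedure used in the study, and the function names are hypothetical.

```python
def midpoint_density(fragments, chrom_length):
    """Count aligned fragment midpoints at each base of one chromosome.

    fragments: iterable of (start, end), 0-based half-open (assumed format).
    """
    density = [0] * chrom_length
    for start, end in fragments:
        mid = (start + end - 1) // 2  # central base of the fragment
        if 0 <= mid < chrom_length:
            density[mid] += 1
    return density


def smooth(density, half_width):
    """Sliding-window sum over 2 * half_width + 1 bases (assumed smoothing)."""
    n = len(density)
    return [
        sum(density[max(0, i - half_width):min(n, i + half_width + 1)])
        for i in range(n)
    ]


def call_footprints(signal, threshold):
    """Return half-open intervals where the signal is >= threshold.

    A fixed threshold is an illustrative stand-in for a real peak caller.
    """
    footprints = []
    start = None
    for pos, value in enumerate(signal):
        if value >= threshold and start is None:
            start = pos
        elif value < threshold and start is not None:
            footprints.append((start, pos))
            start = None
    if start is not None:
        footprints.append((start, len(signal)))
    return footprints


if __name__ == "__main__":
    # Toy data: three overlapping fragments and one isolated fragment.
    toy_fragments = [(10, 40), (12, 42), (11, 41), (100, 130)]
    signal = smooth(midpoint_density(toy_fragments, 200), half_width=2)
    print(call_footprints(signal, threshold=2))
```

Using midpoints rather than full fragment coverage concentrates the signal near the center of each protected fragment, which is why the resulting intervals can be much narrower than the fragments themselves.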
Summary
One of the fundamental drivers of phenotypic variation is the activation or repression of gene transcription. Determining where TFs bind genome-wide provides insights into transcriptional programs that are active across organs and environmental conditions, and it allows for the identification of cis-elements and underlying sequence motifs [1,2]. ChIP requires a potent and epitope- or TF-specific antibody, and only a few antibodies generally qualify as such [4]. While ChIP-seq does reveal TF-binding sites genome-wide and in the native chromatin context, it does so for a single TF, which dramatically limits its applicability and scalability to characterize entire cistromes. A comprehensive understanding of all TF-binding sites for even a single organ would require prior knowledge of which TFs are present and active, as well as thousands of ChIP-seq experiments performed under identical conditions.