Abstract
Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits[1], but it remains challenging to distinguish variants that affect regulatory function[2]. Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin[3–6]. However, only a small fraction of such sites have been precisely resolved on the human genome sequence[6]. Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate about 4.5 million compact genomic elements that encode transcription factor occupancy at nucleotide resolution. We map the fine-scale structure within about 1.6 million DNase I-hypersensitive sites and show that the overwhelming majority are populated by well-spaced sites of single transcription factor–DNA interaction. Cell-context-dependent cis-regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions[1,7] is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.
Highlights
In vivo binding of regulatory factors shields bound DNA elements from nuclease attack, giving rise to protected single-nucleotide-resolution DNA ‘footprints’
To identify DNase I footprints genome-wide, we developed a computational approach that incorporates both chromatin architecture and exhaustively enumerated empirical DNase I sequence preferences to determine expected per-nucleotide cleavage rates across the genome, and to derive, for each biosample, a statistical model for testing whether its observed cleavage rates at individual nucleotides deviated significantly from expectation (Extended Data Fig. 1a–g, Supplementary Methods)
Because transcription factor (TF) engagement creates subtle alterations in DNA shape and protects underlying phosphate bonds from nuclease attack via steric hindrance[6], we investigated to what extent fluctuations in corrected DNase I cleavage rates within individual consensus footprints accurately reflected the topology of the TF–DNA interface
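The idea of testing observed per-nucleotide cleavage against an expected rate can be illustrated with a small sketch. This is not the study's model (that is described in its Supplementary Methods and may differ substantially); it assumes, purely for illustration, that cleavage counts at a nucleotide follow a Poisson distribution around the expected count. The function names, the example counts, and the `alpha` threshold are all hypothetical.

```python
import math


def poisson_lower_tail(observed: int, expected: float) -> float:
    """Return P(X <= observed) for X ~ Poisson(expected).

    ASSUMPTION: a Poisson null is a stand-in chosen for simplicity;
    it is not taken from the source.
    """
    term = math.exp(-expected)  # P(X = 0)
    total = term
    for k in range(1, observed + 1):
        term *= expected / k  # P(X = k) from P(X = k - 1)
        total += term
    return min(total, 1.0)


def depleted_positions(observed, expected, alpha=0.01):
    """Indices where observed cleavage is significantly below expectation."""
    return [
        i
        for i, (obs, exp) in enumerate(zip(observed, expected))
        if poisson_lower_tail(obs, exp) < alpha
    ]


# Hypothetical counts: positions 2-4 are protected relative to expectation.
observed_counts = [12, 9, 1, 0, 2, 11]
expected_counts = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
protected = depleted_positions(observed_counts, expected_counts)
print(protected)
```

A run of adjacent depleted positions is what a protected footprint would look like under this simplified null; a real analysis would also need to handle overdispersion and multiple testing.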
Summary
Morphology of TF binding events within individual regulatory elements could be used to gain insight into the mechanistic basis of TF cooperativity. To quantify global footprint spacing patterns, we first binned each DHS by its average accessibility across all biosamples (as footprint discovery depends on total DNase I cleavage; Extended Data Fig. 1b), and for each bin we computed the mean number of footprints present per element and their relative edge-to-edge spacing. Within DHSs, footprints exhibited average edge-to-edge spacing of about 21 bp (middle 50%, 12–35 bp) (Fig. 3c, bottom). Together, these results are compatible with the observed lack of evolutionary constraint on the spacing and orientation[29,30,31,32,33] of TF motifs and strongly suggest that steady-state regulatory DNA accessibility is maintained by independent but synergistic TF binding modes (Fig. 3d).
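The binning and spacing computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: each DHS is a hypothetical dict carrying a precomputed mean `accessibility` value and a list of footprints as half-open `(start, end)` intervals, and the bin edges are arbitrary.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean


def edge_to_edge_gaps(footprints):
    """Gaps (bp) between consecutive non-overlapping footprints in one DHS.

    ASSUMPTION: footprints are half-open (start, end) intervals, so the
    gap is the next start minus the current end.
    """
    ordered = sorted(footprints)
    return [
        nxt[0] - cur[1]
        for cur, nxt in zip(ordered, ordered[1:])
        if nxt[0] >= cur[1]
    ]


def summarize_by_accessibility(dhss, bin_edges):
    """Per accessibility bin: mean footprints per DHS and mean gap."""
    counts = defaultdict(list)
    gaps = defaultdict(list)
    for dhs in dhss:
        b = bisect_right(bin_edges, dhs["accessibility"])  # bin index
        counts[b].append(len(dhs["footprints"]))
        gaps[b].extend(edge_to_edge_gaps(dhs["footprints"]))
    return {
        b: {
            "mean_footprints": mean(counts[b]),
            "mean_gap": mean(gaps[b]) if gaps[b] else None,
        }
        for b in sorted(counts)
    }


# Hypothetical DHSs; the values are made up for illustration only.
example_dhss = [
    {"accessibility": 0.5, "footprints": [(100, 112), (130, 145)]},
    {"accessibility": 0.8, "footprints": [(10, 20)]},
    {"accessibility": 3.0, "footprints": [(0, 10), (31, 40), (52, 60)]},
]
summary = summarize_by_accessibility(example_dhss, bin_edges=[1.0])
print(summary)
```

Binning first matters because, as the text notes, footprint discovery depends on total DNase I cleavage, so spacing statistics are only comparable among elements of similar accessibility.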