Abstract

Background: Advances in sequencing technology have boosted population genomics and made it possible to map the positions of transcription factor binding sites (TFBSs) with high precision. Here we investigate TFBS variability by combining transcription factor binding maps generated by ENCODE, modENCODE, our previously published data and other sources with genomic variation data for human individuals and Drosophila isogenic lines.

Results: We introduce a metric of TFBS variability that takes into account changes in motif match associated with mutation and makes it possible to investigate TFBS functional constraints instance-by-instance as well as in sets that share common biological properties. We also take advantage of the emerging per-individual transcription factor binding data to show evidence that TFBS mutations, particularly at evolutionarily conserved sites, can be efficiently buffered to ensure coherent levels of transcription factor binding.

Conclusions: Our analyses provide insights into the relationship between individual and interspecies variation and show evidence for the functional buffering of TFBS mutations in both humans and flies. In a broad perspective, these results demonstrate the potential of combining functional genomics and population genetics approaches for understanding gene regulation.
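
The abstract does not spell out how the variability metric is computed. As a rough illustration of what "changes in motif match associated with mutation" can mean, the sketch below scores a site against a motif with log-odds values and reports the score difference caused by a single-base substitution. The matrix `PFM`, the uniform background and the function names are all hypothetical and chosen for illustration; this is not the authors' metric.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif (one dict per position).
# The values are illustrative only and are not taken from the paper.
PFM = [
    {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    {"A": 0.10, "C": 0.10, "G": 0.70, "T": 0.10},
    {"A": 0.10, "C": 0.70, "G": 0.10, "T": 0.10},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
BACKGROUND = 0.25  # assumed uniform base composition


def motif_score(site):
    """Return the log2-odds match of a site to the motif."""
    if len(site) != len(PFM):
        raise ValueError("site length must equal motif length")
    return sum(math.log2(PFM[i][base] / BACKGROUND) for i, base in enumerate(site))


def score_change(site, pos, alt):
    """Return the change in motif score when site[pos] is replaced by alt."""
    mutated = site[:pos] + alt + site[pos + 1:]
    return motif_score(mutated) - motif_score(site)


if __name__ == "__main__":
    # A substitution at an informative position lowers the score ...
    print(score_change("AGCT", 0, "C"))
    # ... while one at the uninformative last position leaves it unchanged.
    print(score_change("AGCT", 3, "A"))
```

In this toy setup, a mutation at a position where the motif has a strong base preference produces a large negative change, whereas a mutation at a degenerate position produces none, which is the kind of distinction a motif-aware variability measure is meant to capture.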

Highlights

  • For these TFs, 24 to 28% of the bound sites overlapped with single-nucleotide polymorphisms (SNPs) identified by the Drosophila Genetic Reference Panel (DGRP) [22] in 162 isogenic lines of Drosophila melanogaster

  • We conclude that the quality and genetic diversity of the DGRP make it suitable for global analyses of TFBS variation and that these data are unlikely to introduce a prohibitive bias
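
A figure such as the share of bound sites overlapping a SNP comes down to an interval lookup. The sketch below shows one simple way to compute it, assuming half-open `(chrom, start, end)` site coordinates and a per-chromosome sorted list of SNP positions. The function name and the data layout are assumptions for illustration, not the pipeline used in the paper.

```python
from bisect import bisect_left


def fraction_with_snp(sites, snps):
    """Return the fraction of sites that contain at least one SNP.

    sites: list of (chrom, start, end) tuples, half-open intervals [start, end).
    snps:  dict mapping chrom to a sorted list of SNP positions.
    """
    if not sites:
        return 0.0
    hits = 0
    for chrom, start, end in sites:
        positions = snps.get(chrom, [])
        # First SNP at or after the site start; it overlaps if it lies before the end.
        i = bisect_left(positions, start)
        if i < len(positions) and positions[i] < end:
            hits += 1
    return hits / len(sites)


if __name__ == "__main__":
    # Toy coordinates, not real data.
    sites = [("2L", 100, 110), ("2L", 200, 210), ("3R", 50, 60), ("X", 5, 15)]
    snps = {"2L": [105, 300], "3R": [60]}
    print(fraction_with_snp(sites, snps))
```

Binary search keeps each lookup logarithmic in the number of SNPs on a chromosome, which matters when scanning genome-wide binding maps against a large variant catalogue.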

Summary

Introduction

Advances in sequencing technology have boosted population genomics and made it possible to map the positions of transcription factor binding sites (TFBSs) with high precision. Evolutionary analyses across species have proven to be a powerful approach for elucidating the functional constraints on DNA elements, in particular protein-coding genes, but are less interpretable in the context of cis-regulatory module (CRM) architecture [6,7]. In part, this is because CRMs often show ‘modular’, rather than ‘base-by-base’, conservation that may escape detection by conventional alignment-based approaches [8]. Even at the level of individual TFBSs, differences in sequence may be hard to interpret: such differences may, for example, reflect evolutionary ‘fine-tuning’ to species-specific factors that preserves uniform outputs, rather than signifying a lack of functional constraint [6,10,11,12].
