Abstract
Background: Differences in gene expression have a significant role in the diversity of phenotypes in humans. Here we integrated human public data from ENCODE, 1000 Genomes and Geuvadis to explore the populational landscape of INDELs affecting transcription factor-binding sites (TFBS). A significant fraction of TFBS close to the transcription start site of known genes is affected by INDELs, with a consequent effect on the expression of the associated gene.

Results: Hundreds of TFBS-affecting INDELs (TFBS-ID) show a differential frequency between human populations, suggesting a role of natural selection in the spread of such variant INDELs. A comparison with a dataset of known human genomic regions under natural selection allowed us to identify several cases of TFBS-ID likely involved in populational adaptations. Ontology analyses on the differential TFBS-ID further indicated several biological processes under natural selection in different populations.

Conclusion: Together, our results strongly suggest that INDELs have an important role in modulating gene expression patterns in humans. The dataset we make available, together with other data reporting variability at both regulatory and coding regions of genes, represents a powerful tool for studies aiming to better understand the evolution of gene regulatory networks in humans.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1744-5) contains supplementary material, which is available to authorized users.
Highlights
Differences in gene expression have a significant role in the diversity of phenotypes in humans
To build a catalogue of TFBS-affecting INDELs (TFBS-ID), we first indexed all transcription factor-binding sites (TFBS) identified by the Encyclopedia of DNA Elements (ENCODE) project in the human reference genome
Data from the 1000 Genomes project on the positions of insertions/deletions (INDELs) in the reference genome were compared with the positions of TFBS, and cases in which an INDEL overlapped a TFBS were selected
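The overlap step described in the highlights can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Interval` type, the function names, the half-open 0-based coordinates and the sample records are all assumptions made for illustration, and each INDEL is assumed to span at least one reference base.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical record for a TFBS or an INDEL (0-based, half-open)."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def find_tfbs_indels(tfbs, indels):
    """Return (indel, site) pairs in which the INDEL overlaps the TFBS."""
    # Group sites by chromosome and sort them by start position.
    by_chrom = defaultdict(list)
    for site in tfbs:
        by_chrom[site.chrom].append(site)
    starts = {}
    for chrom, sites in by_chrom.items():
        sites.sort(key=lambda s: s.start)
        starts[chrom] = [s.start for s in sites]

    hits = []
    for indel in indels:
        sites = by_chrom.get(indel.chrom, [])
        # Only sites that start before the INDEL ends can overlap it.
        upper = bisect_right(starts.get(indel.chrom, []), indel.end - 1)
        for site in sites[:upper]:
            if overlaps(indel, site):
                hits.append((indel, site))
    return hits


# Made-up coordinates, for illustration only.
tfbs = [
    Interval("chr1", 100, 120, "TFBS_A"),
    Interval("chr1", 200, 215, "TFBS_B"),
    Interval("chr2", 50, 60, "TFBS_C"),
]
indels = [
    Interval("chr1", 118, 121, "indel_1"),  # overlaps TFBS_A
    Interval("chr1", 150, 152, "indel_2"),  # between two sites
    Interval("chr2", 60, 62, "indel_3"),    # starts right after TFBS_C
]
hits = find_tfbs_indels(tfbs, indels)
hit_names = [(indel.name, site.name) for indel, site in hits]
```

With these sample records, only `indel_1` is reported, paired with `TFBS_A`. For genome-scale data an interval tree or a dedicated tool would likely be a better fit than this linear scan over candidate sites.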
Summary
Differences in gene expression have a significant role in the diversity of phenotypes in humans. We integrated human public data from ENCODE, 1000 Genomes and Geuvadis to explore the populational landscape of INDELs affecting transcription factor-binding sites (TFBS). A significant fraction of TFBS close to the transcription start site of known genes is affected by INDELs, with a consequent effect on the expression of the associated gene. Transcription factor-binding sites (TFBS) have recently been studied both in humans and other animals [8,9,10]. Vernot et al. [10] found hundreds of variations that are adaptive. Although these studies have shed some light on the evolutionary forces acting on TFBS and other regulatory elements, several issues remain poorly explored or even …