Abstract

BackgroundWith only 2 % of the human genome consisting of protein coding genes, functionality across the rest of the genome has been the subject of much debate. This has gained further impetus in recent years due to a rapidly growing catalogue of genomic elements, based primarily on biochemical signatures (e.g. the ENCODE project). While the assessment of functionality is a complex task, the presence of selection acting on a genomic region is a strong indicator of importance. In this study, we apply population genetic methods to investigate signals overlaying several classes of regulatory elements.ResultsWe disentangle signals of purifying selection acting directly on regulatory elements from the confounding factors of demography and purifying selection linked to e.g. nearby protein coding regions. We confirm the importance of regulatory regions proximal to coding sequence, while also finding differential levels of selection at distal regions. We note differences in purifying selection among transcription factor families. Signals of constraint at some genomic classes were also strongly dependent on their physical location relative to coding sequence. In addition, levels of selection efficacy across genomic classes differed between African and non-African populations.ConclusionsIn order to assign a valid signal of selection to a particular class of genomic sequence, we show that it is crucial to isolate the signal by accounting for the effects of demography and linked-purifying selection. Our study highlights the intricate interplay of factors affecting signals of selection on functional elements.

Highlights

  • With only 2 % of the human genome consisting of protein coding genes, functionality across the rest of the genome has been the subject of much debate

  • Signs of demography When examining the results for the non-annotated class, we observed the well-known reduced diversity in nonAfricans, which has been attributed to the Out-of-Africa bottleneck [27]

  • We searched for signals of selection primarily among predicted regulatory elements and regions bound by transcription factor (TF), and used two different approaches in order to separate the demographic and selection signals: (i) by comparing results from different population groups, known to have experienced some differences in their more recent demographic histories; and (ii) through the use of a “selection-neutral” reference, comprised of non-annotated regions of the genome, which allowed us to control for the effects of demography on each population group

Read more

Summary

Introduction

With only 2 % of the human genome consisting of protein coding genes, functionality across the rest of the genome has been the subject of much debate. Conserved sequences exist due to a lowered substitution rate; caused by the removal of deleterious mutations from regions subject to purifying selection by virtue of their biological importance These methods have been used to provide an estimate for the proportion of functional sites in mammalian genomes [10]; estimated at ~5%. A majority of highly conserved regions detected by comparative genomics investigations have yet to be verified experimentally, through biochemical or functional assays [12] This need is illustrated by the ultra-conserved elements [13], which are genomic regions longer than 200 bases that maintain 100% identity in human, rat and mouse genomes. Hernandez et al [25] found selection signals in conserved non-coding regions and noted that the proximity of these regions to exons may have been responsible for these observations

Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.