Abstract

Large whole-genome sequencing projects have provided access to much rare variation in human populations, which is highly informative about population structure and recent demography. Here, we show how the age of rare variants can be estimated from patterns of haplotype sharing and how these ages can be related to historical relationships between populations. We investigate the distribution of the age of variants occurring exactly twice (f2 variants) in a worldwide sample sequenced by the 1000 Genomes Project, revealing enormous variation across populations. The median age of haplotypes carrying f2 variants is 50 to 160 generations across populations within Europe or Asia, and 170 to 320 generations within Africa. Haplotypes shared between continents are much older, with median ages for haplotypes shared between Europe and Asia ranging from 320 to 670 generations. The distribution of the ages of f2 haplotypes is informative about their demography, revealing recent bottlenecks, ancient splits, and more modern connections between populations. We see the effect of selection in the observation that functional variants are significantly younger than nonfunctional variants of the same frequency. This approach is relatively insensitive to mutation rate and complements other nonparametric methods for demographic inference.

Highlights

  • The recent availability of large numbers of fully sequenced human genomes has allowed, for the first time, detailed investigation of rare genetic variants

  • We investigate the distribution of the maximum likelihood estimates (MLEs) of haplotype age for different classes of f2 variants, for example those shared within or between specific populations

  • We describe an approach to estimate the age of f2 haplotypes without making any prior assumptions about population structure or history

Introduction

The recent availability of large numbers of fully sequenced human genomes has allowed, for the first time, detailed investigation of rare genetic variants. These are highly differentiated between populations [1,2], may make an important contribution to genetic susceptibility to disease [3,4,5,6,7], and provide information about both demographic history and fine-scale population structure [8,9]. We describe an alternative approach which uses the fact that the lengths of shared haplotypes around variants are informative about their ages [13,14,15].
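
To illustrate why the length of a shared haplotype carries information about its age, here is a minimal sketch under a deliberately simplified textbook model, which is an assumption of this example and not necessarily the model used in the paper. Two haplotypes whose common ancestor lived t generations ago are separated by 2t meioses, so the genetic distance (in Morgans) to the first recombination on each side of the variant is taken to be exponential with rate 2t. The total shared length is then Gamma-distributed with shape 2 and rate 2t, and the maximum likelihood estimate of t from a single observed length L is 1/L. The function names are hypothetical, and the sketch ignores chromosome ends, detection error, and mutation.

```python
import math
import random


def gamma2_log_likelihood(length_morgans, t):
    """Log-likelihood of age t (generations) given a shared genetic length.

    Assumes length ~ Gamma(shape=2, rate=2t), the simplified model stated above.
    """
    rate = 2.0 * t
    return 2.0 * math.log(rate) + math.log(length_morgans) - rate * length_morgans


def mle_age(length_morgans):
    """Age maximizing the Gamma(2, 2t) likelihood for one observed length.

    Setting d/dt [2*log(2t) - 2*t*L] = 2/t - 2*L to zero gives t = 1/L.
    """
    return 1.0 / length_morgans


def simulate_length(t, rng):
    """Draw one shared length: the sum of two exponential(2t) distances."""
    rate = 2.0 * t
    return rng.expovariate(rate) + rng.expovariate(rate)


if __name__ == "__main__":
    # A haplotype shared over 1 centimorgan (0.01 Morgans).
    print(mle_age(0.01))

    # Longer shared haplotypes imply younger ages under this model.
    rng = random.Random(1)
    for t in (50, 200, 800):
        lengths = [simulate_length(t, rng) for _ in range(5)]
        print(t, [round(100 * x, 3) for x in lengths])  # lengths in cM
```

Under these assumptions a haplotype shared over 1 cM has an estimated age of about 100 generations. A single length gives a very noisy estimate, which is one reason to study the distribution of estimated ages across many haplotypes, as the text describes, instead of interpreting any one of them.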
