Abstract

Hitchhiking and severe bottleneck effects shape the dynamics of a population's genetic diversity by inducing homogenization at a single locus and at the genome-wide scale, respectively. As a result, identifying and differentiating the signatures of such events from DNA sequence data at a single locus is challenging. This paper develops an analytical framework for identifying and differentiating recent homogenization events at multiple neutral loci in low-recombination regions. The dynamics of genetic diversity at a locus after a recent homogenization event are modeled according to the infinite-sites mutation model and the Wright-Fisher model of reproduction with constant population size. In this setting, I derive analytical expressions for the distribution, mean, and variance of the number of polymorphic sites in a random sample of DNA sequences from a locus affected by a recent homogenization event. Based on this framework, three likelihood-ratio-based tests are presented for identifying and differentiating recent homogenization events at multiple loci. Lastly, I apply the framework to two data sets. First, I consider human DNA sequences from four non-coding loci on different chromosomes to infer the evolutionary history of modern human populations. The results suggest, in particular, that recent homogenization events at the loci are identifiable when the effective human population size is 50,000 or greater, but not when it is 10,000, and that the estimates of the recent homogenization events agree with the “Out of Africa” hypothesis. Second, I use HIV DNA sequences from HIV-1-infected patients to infer the times of HIV seroconversion. The estimates are contrasted with other estimates derived as the midpoint between the last HIV-negative and first HIV-positive screening tests. The results show that significant discrepancies can exist between the two sets of estimates.

Highlights

  • Hitchhiking and severe bottleneck effects leave similar signatures on the population genome by resetting the molecular clock

  • In this process, the ancestral lineages of the sample are traced until time $t = T/(gN_e)$, and mutations are added on the branches of the genealogy as independent Poisson processes with rate $\theta/2$, where $\theta = 2N_e\mu$. In the infinite-sites model, each mutation occurs at a nucleotide site that has not been mutated before

  • The method uses the number of polymorphic sites instead of full polymorphism data in samples of DNA sequences, and it is constrained by the assumptions of the constant size Wright-Fisher reproduction model and the infinite-sites model
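The process described in the highlights can be illustrated with a small Monte Carlo sketch. It is not the paper's analytical derivation. It assumes time is measured in coalescent units, so that k lineages coalesce at rate k(k-1)/2, and it reads the homogenization event as the point at which all lineages still present at time t descend from a single ancestor. The function names and parameter values are hypothetical.

```python
import random

def truncated_tree_length(n, t, rng):
    """Total branch length of a coalescent genealogy of n lineages, cut off at time t."""
    k, elapsed, length = n, 0.0, 0.0
    while k > 1:
        # Waiting time until the next coalescence among k lineages.
        wait = rng.expovariate(k * (k - 1) / 2)
        if elapsed + wait >= t:
            # The homogenization time is reached first: stop tracing lineages.
            return length + k * (t - elapsed)
        length += k * wait
        elapsed += wait
        k -= 1
    return length

def poisson(mean, rng):
    """Poisson draw: count unit-rate exponential arrivals falling before `mean`."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count

def simulate_polymorphic_sites(n, theta, t, rng):
    """Number of polymorphic sites: mutations fall on branches at rate theta/2.

    Under the infinite-sites model each mutation hits a new site, so the
    number of polymorphic sites equals the number of mutations on the tree.
    """
    return poisson(theta / 2 * truncated_tree_length(n, t, rng), rng)

def mean_polymorphic_sites(n, theta, t, reps, seed=0):
    """Monte Carlo estimate of the mean number of polymorphic sites."""
    rng = random.Random(seed)
    return sum(simulate_polymorphic_sites(n, theta, t, rng) for _ in range(reps)) / reps
```

For very large t the genealogy is never truncated, so the estimate from `mean_polymorphic_sites` should approach the standard neutral expectation, θ times the sum of 1/i for i from 1 to n-1. For small t it should be close to θnt/2, because almost no coalescence happens before the homogenization time.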


Summary

Introduction

Hitchhiking and severe bottleneck effects leave similar signatures on the population genome by resetting the molecular clock, but their impacts at the genome level are on different scales. The hitchhiking effect has a local signature because recombination breaks down linkage disequilibrium between sites on the genome; only the locus completely linked to a site under positive selection becomes homogeneous in the population [1]. In contrast, after a relatively quick recovery from a severe bottleneck, a population becomes homogeneous genome-wide. Identifying and differentiating such recent events at a single locus can be challenging because both processes leave a similar signature on the genetic diversity at a single locus. Multi-locus DNA sequence data can be a powerful source for this purpose.
