Abstract

Understanding the processes and conditions under which populations diverge to give rise to distinct species is a central question in evolutionary biology. Since recently diverged populations have high levels of shared polymorphisms, it is challenging to distinguish between recent divergence with no (or very low) inter-population gene flow and older splitting events with subsequent gene flow. Recently published methods to infer speciation parameters under the isolation-migration framework are based on summarizing polymorphism data at multiple loci in two species using the joint site-frequency spectrum (JSFS). We have developed two improvements of these methods based on a more extensive use of the JSFS classes of polymorphisms for species with high intra-locus recombination rates. First, using a likelihood-based method, we demonstrate that taking into account low-frequency polymorphisms shared between species significantly improves the joint estimation of the divergence time and gene flow between species. Second, we introduce a local linear regression algorithm that considerably reduces the computational time and allows for the estimation of unequal rates of gene flow between species. We also investigate which summary statistics from the JSFS allow the greatest estimation accuracy for divergence time and migration rates for low (around 10) and high (around 100) numbers of loci. Focusing on cases with low numbers of loci and high intra-locus recombination rates, we show that our methods for the estimation of divergence time and migration rates are more precise than existing approaches.
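The abstract does not spell out the local linear regression algorithm. As a rough illustration only, the sketch below shows one common form of local linear regression used in simulation-based inference: simulated parameter values are kept when their summary statistics fall close to the observed ones, weighted with an Epanechnikov kernel, and then corrected by a weighted linear fit. Whether the authors use exactly this scheme is an assumption, and the function name `local_linear_adjust`, its arguments, and the acceptance fraction are hypothetical.

```python
import numpy as np

def local_linear_adjust(theta, stats, observed, accept_frac=0.2):
    """Regression-adjust simulated parameters towards the observed summaries.

    theta    : (n_sims,) simulated parameter values (e.g. divergence time)
    stats    : (n_sims, n_stats) summary statistics of each simulation
    observed : (n_stats,) summary statistics of the real data
    Hypothetical helper, not taken from the paper.
    """
    theta = np.asarray(theta, dtype=float)
    stats = np.asarray(stats, dtype=float)
    observed = np.asarray(observed, dtype=float)

    # Put all statistics on a comparable scale before measuring distance.
    scale = stats.std(axis=0)
    scale[scale == 0] = 1.0
    diff = (stats - observed) / scale
    dist = np.sqrt((diff ** 2).sum(axis=1))

    # Keep the closest fraction of simulations (assumes the cutoff is > 0).
    delta = np.quantile(dist, accept_frac)
    keep = dist <= delta

    # Epanechnikov kernel: weight falls to zero at the acceptance boundary.
    w = 1.0 - (dist[keep] / delta) ** 2

    # Weighted least squares of theta on the centred statistics.
    X = np.column_stack([np.ones(keep.sum()), diff[keep]])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], theta[keep] * sw, rcond=None)

    # Project each accepted value onto the observed statistics.
    adjusted = theta[keep] - diff[keep] @ coef[1:]
    return adjusted, w
```

A point estimate can then be taken as `np.average(adjusted, weights=w)`. Because the regression is fitted once on a fixed set of simulations, no further simulation is needed at the estimation step, which is the kind of saving in computational time the abstract alludes to.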

Highlights

  • Understanding speciation processes is crucial in numerous fields including conservation biology, ecology, host-parasite co-evolution and human evolution [1]

  • We demonstrate the benefit of using more than four statistics of the joint site-frequency spectrum (JSFS) for estimating divergence time and migration rates

  • Following the approach pioneered by the authors of the MIMAR software, we developed methods to tackle two limitations of existing estimation procedures: the pervasive problem of intra-locus recombination and the often limited number of loci sequenced and individuals sampled
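To make the JSFS in the highlights concrete, here is a minimal sketch that tabulates it from two samples and then collapses it into four coarse classes. The text above does not name the four statistics; the sketch assumes they are the classes commonly used in isolation-migration analyses (polymorphisms private to each species, shared polymorphisms, and fixed differences). It also assumes polarized 0/1 haplotype matrices with 1 marking the derived allele. The function names `joint_sfs` and `four_classes` are hypothetical.

```python
import numpy as np

def joint_sfs(hap1, hap2):
    """Tabulate the joint site-frequency spectrum of two samples.

    hap1, hap2 : 0/1 arrays of shape (n_sequences, n_sites), same sites in
    both, with 1 = derived allele (assumed encoding).
    Returns an (n1 + 1) x (n2 + 1) matrix whose entry [i, j] counts sites
    with i derived copies in sample 1 and j derived copies in sample 2.
    """
    hap1 = np.asarray(hap1)
    hap2 = np.asarray(hap2)
    n1, n2 = hap1.shape[0], hap2.shape[0]
    c1 = hap1.sum(axis=0)  # derived-allele count per site, species 1
    c2 = hap2.sum(axis=0)  # derived-allele count per site, species 2
    jsfs = np.zeros((n1 + 1, n2 + 1), dtype=int)
    np.add.at(jsfs, (c1, c2), 1)  # accumulate, allowing repeated cells
    return jsfs

def four_classes(jsfs):
    """Collapse the JSFS into four coarse classes (assumed definition)."""
    n1, n2 = jsfs.shape[0] - 1, jsfs.shape[1] - 1
    return {
        # segregating in species 1, monomorphic in species 2
        "private_1": int(jsfs[1:n1, 0].sum() + jsfs[1:n1, n2].sum()),
        # segregating in species 2, monomorphic in species 1
        "private_2": int(jsfs[0, 1:n2].sum() + jsfs[n1, 1:n2].sum()),
        # segregating in both species
        "shared": int(jsfs[1:n1, 1:n2].sum()),
        # fixed for different alleles in the two species
        "fixed": int(jsfs[n1, 0] + jsfs[0, n2]),
    }
```

Collapsing to four numbers discards the frequency at which each polymorphism segregates. Using more than four statistics, as in the highlight above, means keeping finer cells of this matrix, for example separating low-frequency shared polymorphisms from the rest.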


Summary

Introduction

Understanding speciation processes is crucial in numerous fields including conservation biology, ecology, host-parasite co-evolution and human evolution [1]. According to the "biological species concept", a species is defined as a group of interbreeding individuals that are reproductively isolated from other taxa [2]. Under this framework, the study of the speciation process focuses on the conditions leading to the emergence of reproductive isolation [3]. A second scenario considers divergence with continuing gene flow between populations, for example when species ranges abut (parapatry) or overlap following secondary contact, allowing for introgression. The latter model has been suggested to describe speciation events between human populations and ape species or sub-species [4], Drosophila species [5], and the wild tomato species Solanum peruvianum and S. chilense [6]. To reliably use these variances for parameter estimation, data sets with large numbers of sequences are needed, which is a practical constraint in studies of many non-model organisms [8].

