Abstract

The multispecies coalescent (MSC) is the extension of the single-population coalescent model to multiple species. It integrates the phylogenetic process of species divergence with the population genetic process of coalescence, and provides a powerful framework for a number of inference problems using genomic sequence data from multiple species, including estimation of species divergence times and population sizes, estimation of species trees accommodating discordant gene trees, inference of cross-species gene flow, and species delimitation. In this review, we introduce the major features of the MSC model, discuss full-likelihood and heuristic methods of species tree estimation, and summarize recent methodological advances in the inference of cross-species gene flow. We discuss the statistical and computational challenges in the field and research directions where breakthroughs may be likely in the next few years.

Highlights

  • Developed in the 1980s, the coalescent is a stochastic process that describes the genealogical history of a sample of DNA sequences taken from a population [1,2,3]

  • We describe the major features of the multispecies coalescent (MSC) model, and discuss its applications in two major areas: the estimation of the species phylogeny and the inference of cross-species gene flow

  • The joint density of gene trees and coalescent times is a product over the populations on the species tree, and as a result we focus on the contribution from one population
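The product structure in the last highlight can be illustrated with a minimal Python sketch. It assumes the standard MSC parameterization, which is not spelled out in this summary: each population has a parameter theta, time is measured in expected mutations per site, and each pair of lineages coalesces at rate 2/theta. The function names and the dictionary layout are hypothetical and chosen only for illustration.

```python
import math


def log_density_one_population(n_in, waits, theta, duration=None):
    """Log contribution of one population to the gene-tree density.

    Assumed parameterization (not taken from the summary above):
    with k lineages, the total coalescent rate is k(k-1)/theta.

    n_in     -- number of lineages entering the population
    waits    -- waiting times between successive coalescent events
    theta    -- population size parameter of this population
    duration -- length of the population branch; None for the root
                population, which extends indefinitely into the past
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    k = n_in
    elapsed = 0.0
    logp = 0.0
    for t in waits:
        if k < 2:
            raise ValueError("more coalescent events than lineages allow")
        # Density of one coalescence of a specific pair after waiting t
        # with k lineages: (2/theta) * exp(-k(k-1)/theta * t).
        logp += math.log(2.0 / theta) - k * (k - 1) / theta * t
        elapsed += t
        k -= 1
    if duration is not None:
        remaining = duration - elapsed
        if remaining < 0:
            raise ValueError("coalescent times exceed the branch duration")
        # Probability that the k surviving lineages do not coalesce
        # during the rest of the branch.
        logp -= k * (k - 1) / theta * remaining
    elif k != 1:
        raise ValueError("the root population must end with one lineage")
    return logp


def log_density_gene_tree(populations):
    """Sum the per-population log contributions (product of densities)."""
    return sum(log_density_one_population(**p) for p in populations)


# Hypothetical example: two lineages pass through a population without
# coalescing, then coalesce in the root population.
example = [
    {"n_in": 2, "waits": [], "theta": 0.01, "duration": 0.005},
    {"n_in": 2, "waits": [0.001], "theta": 0.01, "duration": None},
]
total = log_density_gene_tree(example)
```

Working on the log scale turns the product over populations into a sum, which is why the contribution of a single population can be written and checked on its own.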


Summary

Introduction

Developed in the 1980s, the coalescent is a stochastic process that describes the genealogical history of a sample of DNA sequences taken from a population [1,2,3].

We consider the joint density of (G_j, t_j), the gene tree with the complete history of coalescence and introgression events at locus j, including the parental path taken by each sequence at each hybridization node. This density is used in full-likelihood implementations of the MSci (MSC with introgression) model. It is very similar to the density under the MSC without gene flow (eq 7), the only modification being that each time a sequence passes a hybridization node, the density is multiplied by a probability φ or 1 − φ, depending on the parental path taken. In the model of figure 10, if sequences b and c reach species Y, it should be possible

Heuristic methods for inferring gene flow
Conclusion