Abstract

Kingman’s coalescent process is a mathematical model of genealogy in which only pairwise common ancestry may occur. Inter-arrival times between successive coalescence events have a negative exponential distribution whose rate equals the combinatorial term $\binom{n}{2}$, where $n$ denotes the number of lineages present in the genealogy. These two standard constraints of Kingman’s coalescent, obtained in the limit of a large population size, approximate the exact ancestral process of Wright-Fisher or Moran models under appropriate parameterization. Calculation of coalescence event probabilities with higher accuracy quantifies the dependence of sample and population sizes that adhere to Kingman’s coalescent process. The convention that probabilities of leading order $N^{-2}$ are negligible provided $n \ll N$ is examined at key stages of the mathematical derivation. Empirically, expected genealogical parity of the single-pair restricted Wright-Fisher haploid model exceeds 99% where $n \le \frac{1}{2}\sqrt[3]{N}$; similarly, per expected interval where $n \le \frac{1}{2}\sqrt{N/6}$. The fractional cubic root criterion is practicable, since although it corresponds to perfect parity and to an extent confounds identifiability it also accords with manageable conditional probabilities of multi-coalescence.
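The approximation described above can be checked numerically. The following is a minimal sketch, not the paper's own computation: it assumes the standard haploid Wright-Fisher result that n lineages have n distinct parents in one generation with probability equal to the product of (1 - i/N) for i = 1, ..., n - 1, and compares it with the first-order Kingman form 1 - C(n,2)/N. The function names are hypothetical and introduced only for illustration.

```python
from math import comb, prod


def p_no_coalescence_exact(n: int, N: int) -> float:
    """Exact one-generation probability that n lineages pick n distinct
    parents in a haploid Wright-Fisher population of size N (assumed model)."""
    return prod(1 - i / N for i in range(1, n))


def p_no_coalescence_kingman(n: int, N: int) -> float:
    """First-order approximation 1 - C(n,2)/N, which drops every term
    of order N^-2 and smaller."""
    return 1 - comb(n, 2) / N


def expected_kingman_times(n: int) -> list[float]:
    """Expected inter-arrival times 1/C(k,2) for k = n, n-1, ..., 2 lineages,
    in coalescent time units (N generations in the haploid model)."""
    return [1 / comb(k, 2) for k in range(n, 1, -1)]


if __name__ == "__main__":
    N = 10_000
    for n in (5, 10, 50, 200):
        exact = p_no_coalescence_exact(n, N)
        approx = p_no_coalescence_kingman(n, N)
        # The gap is the contribution of the omitted higher-order terms.
        print(f"n={n:4d}  exact={exact:.6f}  kingman={approx:.6f}  "
              f"gap={exact - approx:.2e}")
```

Running the script for a fixed N shows the gap between the two probabilities widening as n grows, which is the dependence between sample and population size that the abstract refers to.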

Highlights

  • Kingman’s coalescent process is a mathematical model of ancestral lineages that inspired a paradigmatic era in population genetics [1,2,3]

  • Negligibility depends on terms of leading order $N^{-2}$ or less that can be omitted from the process in the limit of a large population size

  • Derivations of alternative coalescent processes usually retain the conventional proportionality to $N^{-2}$ ([21], Theorem 3.2 via Equation (5); [22], Theorem 2.1 via Equation (4)). These generalizations are in turn based on the partition structures of equivalence classes described in terms of sampling distributions not originally connected to genealogy [23,24,25]
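To make concrete which terms the negligibility convention discards, the one-generation probability of no coalescence can be expanded in powers of $1/N$. This is a sketch that assumes the standard haploid Wright-Fisher product formula, which this excerpt does not state explicitly; the symbol $p_0(n, N)$ is notation introduced here for illustration only.

```latex
\begin{align*}
p_0(n, N) &= \prod_{i=1}^{n-1}\left(1 - \frac{i}{N}\right) \\
          &= 1 - \binom{n}{2}\frac{1}{N}
             + \frac{n(n-1)(n-2)(3n-1)}{24}\,\frac{1}{N^{2}} - \cdots
\end{align*}
```

The coefficient of $N^{-2}$ grows roughly like $n^{4}/8$, so under this expansion the omitted term stays small relative to the $N^{-1}$ term only while $n^{2}$ is small compared with $N$.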


Summary

Introduction

Kingman’s coalescent process is a mathematical model of ancestral lineages that inspired a paradigmatic era in population genetics [1,2,3]. Kingman’s coalescent process demonstrates the utility of this conventional approximation to the exact ancestral process [8]. Phylogenetic trees in general contain a coalescent process of ancestral lineages from the corresponding sub-population within each branch of the phylogeny. The ancestral process within the branches of a phylogeny is often modeled using Kingman’s coalescent [9] or the theory of branching processes [10]. Statistical distribution theory of the Ewens’ sampling formula is derived in population genetics by superimposing unique event mutations on the genealogical structure of Kingman’s coalescent [11,12].

Coalescent Theory of Ancestral Processes
Coalescent Theory of Branching Processes
Zero Coalescence Events
Multiple Coalescence Events
Genealogical Topology and Expected Inter-Arrival Generations
Conditional Probabilities of Multi-Coalescence
Parity of the Kingman Coalescent
Linearization Errors
Parity Paradox
Conclusions