Abstract
Widely used models in genetics include the Wright-Fisher diffusion and its moment dual, Kingman's coalescent. Each has a multilocus extension, but under neither extension is the sampling distribution available in closed form, and computing it is extremely difficult. In this paper we derive two new multilocus population genetic models, one a diffusion and the other a coalescent process, which are much simpler than the standard models but which capture their key properties for large recombination rates. The diffusion model is based on a central limit theorem for density-dependent population processes, and we show that the sampling distribution is a linear combination of moments of Gaussian distributions and hence available in closed form. The coalescent process is based on a probabilistic coupling of the ancestral recombination graph to a simpler genealogical process which exposes the leading dynamics of the former. We further demonstrate that when we consider the sampling distribution as an asymptotic expansion in inverse powers of the recombination parameter, the sampling distributions of the new models agree with the standard ones up to the first two orders.
Highlights
The basis of many important problems in genetics is to find an expression for a sampling distribution or likelihood.
In this paper we show that, roughly speaking, in order to recover the sampling distribution up to $O(\rho^{-1})$ we need consider only the following type of exceptional event: a coalescence occurs more recently than time $U$ in the ancestral recombination graph (ARG), and the coalescence is between two lineages each of which is ancestral to both of the two loci.
In this paper we will suppose a finite-alleles model of mutation such that a mutation to an allele $i$ in type space $E_A = [K]$, $K \in \mathbb{N}$, takes it to allele $k \in [K]$ with probability $P^A_{ik}$, with $E_B = [L]$ and $P^B_{jl}$, $j, l \in [L]$, defined analogously. (As we discover below, the mutation model is not important and we could pose something more complicated with little extra effort.) The probability of a recombination between the two loci per haplotype per generation is denoted by $r$, and we assume that $\rho_\beta = 2N^{\beta} r$ is fixed as $N \to \infty$, for some fixed $\beta \in$
Summary
The basis of many important problems in genetics is to find an expression for a sampling distribution or likelihood. Inter-locus recombination quickly makes such models intractable; for neither the Wright-Fisher diffusion with recombination nor the coalescent with recombination (the ancestral recombination graph, or ARG) is it possible to obtain a closed-form expression for the sampling distribution. In this paper we show that, roughly speaking, in order to recover the sampling distribution up to $O(\rho^{-1})$ we need consider only the following type of exceptional event: a coalescence occurs more recently than time $U$ in the ARG, and the coalescence is between two lineages each of which is ancestral to both of the two loci. This observation enables us to define a simple coalescent process which allows for at most one of these events but is otherwise very similar to the easy limiting process corresponding to $\rho = \infty$.
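To make the order of approximation concrete, the statement about recovering the sampling distribution up to order $\rho^{-1}$ can be read schematically as an expansion of the following form. The symbols $q$, $q_0$, $q_1$ and the sample configuration $\mathbf{n}$ are notation assumed here for illustration only; they are not taken from the text above.

$$
q(\mathbf{n}; \rho) \;=\; q_0(\mathbf{n}) \;+\; \frac{q_1(\mathbf{n})}{\rho} \;+\; O(\rho^{-2}), \qquad \rho \to \infty .
$$

Under this reading, $q_0$ would be the sampling distribution of the limiting process at $\rho = \infty$, and a simplified model agrees with the standard one "up to the first two orders" when it reproduces both $q_0$ and $q_1$.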