Abstract

Widely used models in genetics include the Wright-Fisher diffusion and its moment dual, Kingman's coalescent. Each has a multilocus extension, but under neither extension is the sampling distribution available in closed form, and its computation is extremely difficult. In this paper we derive two new multilocus population genetic models, one a diffusion and the other a coalescent process, which are much simpler than the standard models but which capture their key properties for large recombination rates. The diffusion model is based on a central limit theorem for density-dependent population processes, and we show that its sampling distribution is a linear combination of moments of Gaussian distributions and hence available in closed form. The coalescent process is based on a probabilistic coupling of the ancestral recombination graph to a simpler genealogical process which exposes the leading dynamics of the former. We further demonstrate that, when the sampling distribution is considered as an asymptotic expansion in inverse powers of the recombination parameter, the sampling distributions of the new models agree with those of the standard models up to the first two orders.
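
The expansion referred to in the last sentence can be written schematically as follows. This is only a sketch of its general shape: the symbols $q$, $q_0$, $q_1$ and $\mathbf{n}$ (a sample configuration) are notation assumed here for illustration and are not taken from the text above, while $\rho$ denotes the recombination parameter.

```latex
% Schematic form of an expansion in inverse powers of the recombination parameter.
% q, q_0, q_1 and \mathbf{n} are illustrative symbols, not the paper's own notation.
\[
  q(\mathbf{n};\rho) \;=\; q_0(\mathbf{n}) \;+\; \frac{q_1(\mathbf{n})}{\rho} \;+\; O(\rho^{-2}),
  \qquad \rho \to \infty .
\]
% "Agreement up to the first two orders" then means that the simplified models
% reproduce the same q_0 and q_1 as the standard models.
```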

Highlights

  • The basis of many important problems in genetics is to find an expression for a sampling distribution or likelihood

  • In this paper we show that, roughly speaking, in order to recover the sampling distribution up to $O(\rho^{-1})$ we need consider only the following type of exceptional event: a coalescence occurs more recently than time U in the ancestral recombination graph (ARG), and the coalescence is between two lineages each of which is ancestral to both of the two loci

  • In this paper we will suppose a finite-alleles model of mutation such that a mutation to an allele $i$ in type space $E_A = [K]$, $K \in \mathbb{N}$, takes it to allele $k \in [K]$ with probability $P^A_{ik}$, with $E_B = [L]$ and $P^B_{jl}$, $j, l \in [L]$, defined analogously. (As we discover below, the mutation model is not important and we could pose something more complicated with little extra effort.) The probability of a recombination between the two loci per haplotype per generation is denoted by $r$, and we assume that $\rho_\beta = 2N^\beta r$ is fixed as $N \to \infty$, for some fixed β ∈


Summary

Introduction

The basis of many important problems in genetics is to find an expression for a sampling distribution or likelihood. Inter-locus recombination quickly makes such models intractable: for neither the Wright-Fisher diffusion with recombination nor the coalescent with recombination, also known as the ancestral recombination graph (ARG), is it possible to obtain a closed-form expression for the sampling distribution. In this paper we show that, roughly speaking, in order to recover the sampling distribution up to $O(\rho^{-1})$ we need consider only the following type of exceptional event: a coalescence occurs more recently than time U in the ARG, and the coalescence is between two lineages each of which is ancestral to both of the two loci. This observation enables us to define a simple coalescent process which allows for at most one of these events but is otherwise very similar to the easy limiting process corresponding to $\rho = \infty$.
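
To make the "easy" limit concrete, the sketch below computes a two-locus sampling probability at ρ = ∞, where the two loci evolve independently and the probability factorizes into a product of single-locus terms. It rests on a simplifying assumption that is not part of the text above: mutation at each locus is parent-independent, in which case the single-locus probability of an ordered sample is given by the standard Dirichlet-multinomial formula. The function names and the parameters `theta` and `mutation_probs` are hypothetical and chosen only for illustration.

```python
from math import prod


def rising_factorial(x: float, n: int) -> float:
    """Return x * (x + 1) * ... * (x + n - 1); equals 1 when n == 0."""
    result = 1.0
    for i in range(n):
        result *= x + i
    return result


def single_locus_prob(counts, theta, mutation_probs):
    """Probability of an ordered sample with the given allele counts at one locus.

    ASSUMPTION (not from the source): parent-independent mutation, so the
    stationary allele frequencies are Dirichlet(theta * mutation_probs) and
    the sample probability is the Dirichlet-multinomial moment.
    """
    n = sum(counts)
    numerator = prod(
        rising_factorial(theta * p, c) for p, c in zip(mutation_probs, counts)
    )
    return numerator / rising_factorial(theta, n)


def two_locus_prob_infinite_rho(counts_a, counts_b, theta_a, theta_b,
                                probs_a, probs_b):
    """Two-locus sample probability at rho = infinity.

    With infinitely fast recombination the loci are independent, so the
    probability is the product of the two single-locus probabilities,
    evaluated on the marginal allele counts at each locus.
    """
    return (single_locus_prob(counts_a, theta_a, probs_a)
            * single_locus_prob(counts_b, theta_b, probs_b))


if __name__ == "__main__":
    # Two haplotypes, two alleles per locus, symmetric mutation at both loci.
    p = two_locus_prob_infinite_rho([1, 1], [2, 0], 1.0, 1.0,
                                    [0.5, 0.5], [0.5, 0.5])
    print(p)
```

The product form is the zeroth-order behaviour only; the exceptional coalescence events described above are what contribute the correction at the next order in the recombination parameter.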

Notation and previous results
Diffusion model
Neutral Moran model
Gaussian diffusion limit of fluctuations in linkage disequilibrium
Stationary distribution
Sampling distribution
Accuracy of the diffusion process
A coupling argument
1: Cumulative distribution
Discussion
