Abstract

The distribution of allele frequencies across a large number of biallelic sites is known as the “allele-frequency spectrum” or “site-frequency spectrum” (SFS). Without selection and in regions of relatively high recombination rates, sites may be assumed to be independently and identically distributed. With a beta equilibrium distribution of allelic proportions and binomial sampling, a beta–binomial compound likelihood results for each site. The likelihood of the data and the posterior distribution of two parameters, the scaled mutation rate θ and the mutation bias α, are investigated in the general case and for small scaled mutation rates θ. In the general case, an expectation–maximization (EM) algorithm is derived to obtain maximum-likelihood estimates of both parameters. With an appropriate prior distribution, a Markov chain Monte Carlo sampler to integrate the posterior distribution is also derived. As far as I am aware, previous maximum-likelihood or Bayesian estimators of θ explicitly or implicitly assume small scaled mutation rates, i.e., θ≪1. For θ≪1, maximum-likelihood estimators are also derived for both parameters using a Taylor series expansion of the beta–binomial distribution. The estimator of θ is a variant of the Ewens–Watterson estimator and of the maximum-likelihood estimator derived with the Poisson Random Field approach. With a conjugate prior distribution, marginal and conditional beta posterior distributions are also derived for both parameters.
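As an illustration of the beta–binomial compound likelihood mentioned above, the following is a minimal sketch, not the paper's implementation. It assumes that the beta equilibrium distribution has shape parameters αθ and (1−α)θ; this parameterization is not stated in the abstract. The function names and the layout of `counts` (entry y holds the number of sites at which y of the n sampled alleles are of the focal type) are hypothetical and chosen only for this example.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(y, n, theta, alpha):
    """Log-probability of observing y focal alleles in a sample of size n.

    ASSUMPTION: allelic proportions follow a beta(alpha*theta, (1-alpha)*theta)
    distribution; the abstract does not spell out this parameterization.
    """
    a = alpha * theta
    b = (1.0 - alpha) * theta
    log_choose = (math.lgamma(n + 1) - math.lgamma(y + 1)
                  - math.lgamma(n - y + 1))
    return log_choose + log_beta(y + a, n - y + b) - log_beta(a, b)


def sfs_log_likelihood(counts, theta, alpha):
    """Log-likelihood of a site-frequency spectrum under independent sites.

    counts[y] is the number of sites with y focal alleles; the sample size
    is inferred as n = len(counts) - 1 (hypothetical data layout).
    """
    n = len(counts) - 1
    return sum(c * beta_binomial_logpmf(y, n, theta, alpha)
               for y, c in enumerate(counts) if c)
```

Because sites are treated as independent and identically distributed, the log-likelihood is a sum of per-site terms weighted by the spectrum counts. Such a function could be evaluated on a grid of (θ, α) values, or passed to a numerical optimizer, as a rough check on estimates obtained by other means.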
