Exact Likelihood Calculation under the Infinite Sites Model

Muhammad Faisal, Claus Vogl, Andreas Futschik

doi:10.3390/computation3040701

Abstract

A key parameter in population genetics is the scaled mutation rate θ = 4Nμ, where N is the effective haploid population size and μ is the mutation rate per haplotype per generation. While exact likelihood inference is notoriously difficult in population genetics, we propose a novel approach to compute a first-order accurate likelihood of θ that is based on dynamic programming under the infinite sites model without recombination. The parameter θ may be either constant, i.e., time-independent, or time-dependent, which allows for changes of demography and deviations from neutral equilibrium. For time-independent θ, the performance is compared to the approach in Griffiths and Tavaré’s work “Simulating Probability Distributions in the Coalescent” (Theor. Popul. Biol. 1994, 46, 131–159) that is based on importance sampling and implemented in the “genetree” program. Roughly, the proposed method is computationally fast when n × θ < 100, where n is the sample size. For time-dependent θ(t), we analyze a simple demographic model with a single change in θ(t). In this case, the ancestral and current θ need to be estimated, as well as the time of change. To our knowledge, this is the first accurate computation of a likelihood in the infinite sites model with non-equilibrium demography.

Highlights

The infinite sites model is among the simplest models in population genetics
With all mutations occurring at different positions, modeling of genetic variation becomes both mathematically and computationally easier [1]
We compared our dynamic programming (DP) method with the genetree method proposed by Griffiths and Tavaré [7], which is based on importance sampling

Summary

Introduction

The infinite sites model is among the simplest models in population genetics. Polymorphism is assumed to arise by single mutations of unique sites along a stretch of DNA. Population sizes may vary with time, and the scaled mutation rate then varies with them. This leads to a time-dependent parameter θ(t) = 4N(t)μ, and the distribution of the data will deviate from that under neutral equilibrium. Wu [8] showed that a dynamic programming algorithm can speed up summation over all possible genealogies to make exact calculations feasible for larger datasets. None of these approaches allows for inference in the presence of time-dependent variation of population sizes or mutation rates.
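To make the time-dependent parameter concrete, the single-change demographic model mentioned in the abstract can be written as a piecewise-constant θ(t). The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `scaled_mutation_rate` and `SingleChangeModel` are hypothetical, and time is assumed to be measured backwards from the present in whatever units the analysis uses.

```python
from dataclasses import dataclass


def scaled_mutation_rate(n_eff: float, mu: float) -> float:
    """Return theta = 4 * N * mu for population size N and mutation rate mu."""
    return 4.0 * n_eff * mu


@dataclass(frozen=True)
class SingleChangeModel:
    """Piecewise-constant theta(t) with one change (hypothetical helper).

    Assumption: t is measured backwards from the present (t = 0 is today).
    """
    theta_current: float    # theta from the present back to t_change
    theta_ancestral: float  # theta further back in time than t_change
    t_change: float         # time of the single change

    def theta(self, t: float) -> float:
        if t < 0:
            raise ValueError("t is measured backwards from the present, so t >= 0")
        # More recent than the change: current value; otherwise ancestral value.
        return self.theta_current if t < self.t_change else self.theta_ancestral


# Illustrative numbers only (not taken from the paper).
model = SingleChangeModel(
    theta_current=scaled_mutation_rate(n_eff=1000, mu=0.001),
    theta_ancestral=scaled_mutation_rate(n_eff=250, mu=0.001),
    t_change=0.5,
)
```

The three fields correspond to the three quantities that, according to the abstract, need to be estimated in this model: the current θ, the ancestral θ, and the time of change.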

Dynamic Programming Algorithms for Estimating θ

Basic Probability Model

Efficient Likelihood Computation

Example

Calculating the Likelihood for Time-Independent θ

Calculating the Likelihood for Time-Dependent θ

4: Termination

Simulations

Discussion

Journal: Computation
Publication Date: Dec 11, 2015
Citations: 11
License type: CC BY 4.0
