Abstract

Nonparametric coalescent-based models are often employed to infer past population dynamics over time. Several of these models, such as the skyride and skygrid models, are equipped with a block-updating Markov chain Monte Carlo sampling scheme to efficiently estimate model parameters. The advent of powerful computational hardware, along with the use of high-performance libraries for statistical phylogenetics, has, however, made the development of alternative estimation methods feasible. Here we present the implementation and performance assessment of a Hamiltonian Monte Carlo gradient-based sampler to infer the parameters of the skygrid model. The skygrid is a popular and flexible coalescent-based model for estimating population dynamics over time and is available in BEAST 1.10.5, a widely used software package for Bayesian phylogenetic and phylodynamic analysis. Taking into account the increased computational cost of gradient evaluation, we report substantial increases in effective sample size per time unit compared to the established block-updating sampler. We expect gradient-based samplers to assume an increasingly important role for different classes of parameters typically estimated in Bayesian phylogenetic and phylodynamic analyses.

Highlights

  • Inference of effective population size over time from a sample of molecular sequences is a key aspect of many phylodynamics studies

  • In a Bayesian framework, coalescent models function as prior distributions for phylogenetic trees and, in conjunction with observed sequence data likelihoods based on continuous-time Markov models for molecular character evolution on trees [9], they enable the estimation of effective population size directly from molecular sequence data

  • Ebola virus, and rice yellow mottle virus (RYMV) data sets, we observe that Hamiltonian Monte Carlo (HMC) consistently outperforms the standard skygrid block-updating MCMC (BUMCMC) transition kernel in terms of more efficiently generating effectively independent samples of skygrid model parameters

Introduction

Inference of effective population size over time from a sample of molecular sequences is a key aspect of many phylodynamics studies. Inference methods typically employ coalescent models that connect population dynamics to the shape of a genealogy relating such a sample. Flexible nonparametric coalescent models have become widely used [2–8]. These models typically posit that the effective population size as a function of time (referred to as the “demographic function”) assumes a piecewise constant form, thereby avoiding restrictive a priori assumptions about the specific parametric form of the demographic function. In a Bayesian framework, coalescent models function as prior distributions for phylogenetic trees and, in conjunction with observed sequence data likelihoods based on continuous-time Markov models for molecular character evolution on trees, they enable the estimation of effective population size directly from molecular sequence data.
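
To make the piecewise constant form of the demographic function concrete, the following is a minimal Python sketch, not the BEAST implementation. The function name `demographic_function`, the argument names, and the choice to store population sizes on the log scale are assumptions made for illustration only. Given K increasing grid points (times before the present), the sketch holds K + 1 values, one per interval, and looks up the value that applies at a query time.

```python
import numpy as np


def demographic_function(t, grid_points, log_pop_sizes):
    """Evaluate a piecewise constant effective population size at time(s) t.

    Hypothetical helper for illustration only.
    grid_points   : K increasing times (before present) where the value may change
    log_pop_sizes : K + 1 log effective population sizes, one per interval;
                    the last value applies to all times beyond the final grid point
    """
    grid_points = np.asarray(grid_points, dtype=float)
    log_pop_sizes = np.asarray(log_pop_sizes, dtype=float)
    if log_pop_sizes.size != grid_points.size + 1:
        raise ValueError("need exactly one more population size than grid points")
    if np.any(np.diff(grid_points) <= 0):
        raise ValueError("grid points must be strictly increasing")
    # Index of the interval containing t: interval i covers (x_{i-1}, x_i],
    # so a time exactly on a grid point belongs to the interval ending there.
    idx = np.searchsorted(grid_points, t, side="left")
    return np.exp(log_pop_sizes[idx])


# Usage: three grid points define four intervals.
grid = [1.0, 2.0, 3.0]
log_sizes = np.log([10.0, 20.0, 40.0, 80.0])
sizes = demographic_function(np.array([0.5, 1.5, 5.0]), grid, log_sizes)
```

Because the value is constant within each interval, no parametric growth curve is imposed; the shape of the trajectory is determined entirely by the per-interval values, which are the parameters a sampler would estimate.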
