Abstract

Homologous recombination is a central feature of bacterial evolution, yet it confounds traditional phylogenetic methods. While a number of methods specific to bacterial evolution have been developed, none of these permit joint inference of a bacterial recombination graph and associated parameters. In this article, we present a new method which addresses this shortcoming. Our method uses a novel Markov chain Monte Carlo algorithm to perform phylogenetic inference under the ClonalOrigin model. We demonstrate the utility of our method by applying it to ribosomal multilocus sequence typing data sequenced from pathogenic and nonpathogenic Escherichia coli serotype O157 and O26 isolates collected in rural New Zealand. The method is implemented as an open source BEAST 2 package, Bacter, which is available via the project web page at http://tgvaughan.github.io/bacter.

Highlights

  • Recombination plays a crucial role in the molecular evolution of many bacteria, in spite of the clonal nature of bacterial reproduction.

  • These models acknowledge that the asymmetry present in the bacterial context allows a precise clonal genealogy to be defined: the clonal frame (CF), which represents the true reproductive genealogy of a given set of bacterial samples and the ancestry of the majority of their genetic material.

  • The model of Didelot and Falush (2007) allows the Markov chain Monte Carlo (MCMC) algorithm implemented in the associated ClonalFrame software package to jointly infer the bacterial CF, conversion rate, and tract length parameters, neatly avoiding the branch-length bias described by Schierup and Hein (2000).


Introduction

Recombination plays a crucial role in the molecular evolution of many bacteria, in spite of the clonal nature of bacterial reproduction. Didelot and Falush (2007) presented a method for performing inference under a model of molecular evolution that, in combination with a standard substitution model, includes effects similar to those resulting from gene conversion: instantaneous events that simultaneously produce character-state changes at multiple sites within a randomly positioned conversion tract. This model does not consider the origin of these changes: it dispenses entirely with the ancestral recombination graph (ARG) and can be considered a rather peculiar substitution model applied to the evolution of sequences down the clonal frame (CF). It does, however, allow the Markov chain Monte Carlo (MCMC) algorithm implemented in the associated ClonalFrame software package to jointly infer the bacterial CF, conversion rate, and tract length parameters, neatly avoiding the branch-length bias described by Schierup and Hein (2000). Didelot and Wilson (2015) later introduced a maximum likelihood method for performing inference under the same model, making it possible to infer CFs from whole bacterial genomes rather than only the short sequences that the earlier Bayesian method could handle.
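
The idea of a conversion-like event acting as a substitution process can be illustrated with a minimal Python sketch. It is not the ClonalFrame implementation: the function name, the geometric tract length distribution, and the per-site change probability are assumptions made purely for illustration. Each event picks a random start position, draws a tract length, and changes the character state at sites inside the tract, with no reference to where the new states came from.

```python
import random

ALPHABET = "ACGT"

def apply_conversion_events(seq, n_events, mean_tract_len, site_change_prob, rng):
    """Apply conversion-like events to a nucleotide string (illustrative only).

    Assumptions (not taken from the article): tract lengths are geometric
    with mean `mean_tract_len`, and each site inside a tract changes state
    independently with probability `site_change_prob`.
    """
    sites = list(seq)
    stop_prob = 1.0 / mean_tract_len
    for _ in range(n_events):
        # Randomly positioned tract start.
        start = rng.randrange(len(sites))
        # Geometric tract length (at least one site).
        length = 1
        while rng.random() > stop_prob:
            length += 1
        end = min(start + length, len(sites))
        # Several sites within the tract change in a single event.
        for i in range(start, end):
            if rng.random() < site_change_prob:
                sites[i] = rng.choice([b for b in ALPHABET if b != sites[i]])
    return "".join(sites)

if __name__ == "__main__":
    rng = random.Random(42)
    original = "A" * 60
    print(apply_conversion_events(original, 2, 8, 0.5, rng))
```

Because the changes cluster within tracts rather than falling independently along the sequence, treating them as ordinary point substitutions would overstate the amount of evolution on a branch, which is the kind of bias the joint inference is meant to avoid.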
