Abstract

Genomics is increasingly being used to investigate disease outbreaks, but an important question remains unanswered: how well do genomic data capture known transmission events, particularly for pathogens with long carriage periods or large within-host population sizes? Here we present a novel Bayesian approach to reconstruct densely sampled outbreaks from genomic data while considering within-host diversity. We infer a time-labeled phylogeny using Bayesian evolutionary analysis by sampling trees (BEAST), and then infer a transmission network via Markov chain Monte Carlo (MCMC). We find that under a realistic model of within-host evolution, reconstructions of simulated outbreaks contain substantial uncertainty even when genomic data reflect a high substitution rate. Reconstruction of a real-world tuberculosis outbreak displayed similar uncertainty, although the correct source case and several clusters of epidemiologically linked cases were identified. We conclude that genomics cannot wholly replace traditional epidemiology but that Bayesian reconstructions derived from sequence data may form a useful starting point for a genomic epidemiology investigation.

Highlights

  • Infectious disease outbreak investigation typically involves both field work – interviews that establish links between patients and/or exposures to a source of infection – and molecular epidemiology, in which laboratory typing of pathogen isolates is used to identify related cases

  • To assess the impact of within-host diversity on the inference of transmission events from a phylogeny, we simulated a transmission tree – the set of specific person-to-person transmission events within an outbreak – and genealogies arising from this tree under two scenarios of effective population size

  • We present a Bayesian inference method for reconstructing transmission events in a densely sampled outbreak using time-labelled genomic data

Summary

Introduction

Infectious disease outbreak investigation typically involves both field work – interviews that establish links between patients and/or exposures to a source of infection – and molecular epidemiology, in which laboratory typing of pathogen isolates is used to identify related cases. Data from both streams are considered together to reconstruct the outbreak, identifying its origins and pathways of onward transmission. Given the short timescales over which outbreaks typically occur, only a small number of single nucleotide changes are expected between outbreak isolates. This diversity is not captured by traditional typing methods, but can be identified and leveraged for outbreak reconstruction using whole-genome approaches. Given the importance of factors such as branch length in inferring the underlying host contact network structure from a phylogeny [24], an inference method that incorporates sampling times and a molecular clock analysis is preferable to one using neighbour-joining, maximum parsimony, or other simplified tree-building algorithms.
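
To make the point about short timescales concrete, the following minimal Python sketch estimates how many single nucleotide differences one might expect between two outbreak isolates under a strict molecular clock. The substitution rate, genome length, and time since the common ancestor are hypothetical illustrative values, not figures taken from this study, and the function names are invented for this example. It also assumes both isolates are sampled at the same time and that substitutions follow a Poisson process.

```python
import math


def expected_pairwise_snps(rate_per_site_per_year, genome_length, years_since_common_ancestor):
    """Expected number of SNP differences between two isolates under a strict clock.

    Both lineages accumulate substitutions independently after diverging from
    their common ancestor, hence the factor of 2.
    """
    return 2.0 * rate_per_site_per_year * genome_length * years_since_common_ancestor


def prob_identical(expected_snps):
    """Poisson probability that the two genomes show zero differences."""
    return math.exp(-expected_snps)


# Hypothetical inputs chosen only for illustration (assumptions, not study data).
rate = 1e-7                # substitutions per site per year
genome_length = 4_400_000  # sites
years = 2.0                # time back to the common ancestor

mean_snps = expected_pairwise_snps(rate, genome_length, years)
print(f"Expected pairwise SNPs: {mean_snps:.2f}")
print(f"Probability the two isolates are identical: {prob_identical(mean_snps):.3f}")
```

With inputs of this order, the expected pairwise distance is only a handful of SNPs at most, which is why low-resolution typing methods cannot separate outbreak isolates and why branch lengths calibrated by sampling times carry much of the usable signal.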

