Abstract

The relationship between the underlying contact network over which a pathogen spreads and the pathogen phylogenetic trees that are obtained presents an opportunity to use sequence data to learn about contact networks that are difficult to study empirically. However, this relationship is not explicitly known and is usually studied in simulations, often with the simplifying assumption that the contact network is static in time, though human contact networks are dynamic. We simulate pathogen phylogenetic trees on dynamic Erdős-Rényi random networks and on two dynamic networks with skewed degree distributions, one of which is additionally clustered. We use tree shape features to explore how adding dynamics changes the relationships between the overall network structure and phylogenies. Our tree features include the number of small substructures (cherries, pitchforks) in the trees, measures of tree imbalance (Sackin index, Colless index), features derived from network science (diameter, closeness), and features using the internal branch lengths from the tip to the root. Using principal component analysis, we find that the network dynamics influence the shapes of phylogenies, as does the network type. We also compare dynamic and time-integrated static networks. We find, in particular, that static network models like the widely used Barabási-Albert model can be poor approximations for dynamic networks. We explore the effects of mis-specifying the network on the performance of classifiers trained to identify the transmission rate (using supervised learning methods). We find that mis-specification of either the underlying network or its parameters (mean degree, turnover rate) has a strong adverse effect on the ability to estimate the transmission parameter. We illustrate these results by classifying HIV trees with a classifier that we trained on simulated trees from different networks, infection rates and turnover rates.
Our results point to the importance of correctly estimating and modelling contact networks with dynamics when using phylodynamic tools to estimate epidemiological parameters.
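
As a rough illustration of the topological features named in the abstract, the sketch below counts cherries and pitchforks and computes the Sackin and Colless indices for a rooted binary tree. It uses the standard textbook definitions (Sackin: sum of tip depths; Colless: sum over internal nodes of the absolute difference in tip counts between the two child clades). The nested-tuple tree encoding and the function name `tree_shape_features` are assumptions made for this example, not the authors' implementation.

```python
def tree_shape_features(tree):
    """Topological shape features of a rooted binary tree.

    Assumed encoding (illustrative only): a tip is any non-tuple label,
    an internal node is a 2-tuple (left_subtree, right_subtree).
    """
    stats = {"cherries": 0, "pitchforks": 0, "sackin": 0, "colless": 0}

    def visit(node, depth):
        # Returns the number of tips in the clade rooted at `node`.
        if not isinstance(node, tuple):
            stats["sackin"] += depth  # Sackin index: sum of tip depths
            return 1
        left, right = node
        n_left = visit(left, depth + 1)
        n_right = visit(right, depth + 1)
        if n_left == 1 and n_right == 1:
            stats["cherries"] += 1    # clade with exactly two tips
        if n_left + n_right == 3:
            stats["pitchforks"] += 1  # clade with exactly three tips
        stats["colless"] += abs(n_left - n_right)  # imbalance at this node
        return n_left + n_right

    stats["tips"] = visit(tree, 0)
    return stats


# Two four-tip trees: fully unbalanced (caterpillar) and fully balanced.
caterpillar = tree_shape_features(((("a", "b"), "c"), "d"))
balanced = tree_shape_features((("a", "b"), ("c", "d")))
print(caterpillar)
print(balanced)
```

The recursion is adequate for small trees; for trees with thousands of tips an iterative traversal would avoid Python's recursion limit. Features based on branch lengths or on network-science measures (diameter, closeness) would need a tree representation that stores edge lengths and are not covered here.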

Highlights

  • Understanding whether and how the transmission patterns of a pathogen are revealed by branching patterns in pathogen phylogenetic trees remains a challenging research question

  • The host contact network is often difficult to study, in particular as it evolves in time

  • We study the tree features with principal component analysis and with supervised learning methods, and find that network dynamics and network type can strongly influence the shape of phylogenetic trees

Introduction

Understanding whether and how the transmission patterns of a pathogen are revealed by branching patterns in pathogen phylogenetic trees remains a challenging research question. Alongside the stochastic diversification of the pathogen on the short time scales of an infectious disease outbreak, branching patterns in the pathogen’s phylogenetic tree depend strongly on the underlying transmission pattern [1] and the host contact structure, as these shape the pathogen’s reproductive opportunities. The topology of the host contact network plays a crucial role in setting the epidemic threshold, the epidemic size and the most effective interventions. Network properties also help determine which individuals are at high risk of infection. Modellers seek to inform simulated networks with individual-level data from real populations. Individuals may not wish to report contacts to public health practitioners.
