Abstract

Phylogenetic comparative models (PCMs) have been used to study macroevolutionary patterns, to characterize adaptive phenotypic landscapes, to quantify rates of evolution, to measure trait heritability, and to test various evolutionary hypotheses. A major obstacle to applying these models has been the complexity of evaluating their likelihood function. Recent works have shown that for many PCMs, the likelihood can be obtained in time proportional to the size of the tree based on post‐order tree traversal, also known as pruning. Despite this progress, inferring complex multi‐trait PCMs on large trees remains a time‐intensive task. Here, we study parallelizing the pruning algorithm as a generic technique for speeding up PCM inference. We implement several parallel traversal algorithms in the form of a generic C++ library for Serial and Parallel LIneage Traversal of Trees (SPLITT). Based on SPLITT, we provide examples of parallel likelihood evaluation for several popular PCMs, ranging from a single‐trait Brownian motion model to complex multi‐trait Ornstein–Uhlenbeck and mixed Gaussian phylogenetic models. Using the phylogenetic Ornstein–Uhlenbeck mixed model (POUMM) as a showcase, we run benchmarks on up to 24 CPU cores, reporting up to an order of magnitude parallel speed‐up for the likelihood calculation on simulated balanced and unbalanced trees of up to 100,000 tips with up to 16 traits. Because the parallel speed‐up depends on multiple factors, the SPLITT library can automatically select the fastest traversal strategy for a given hardware, tree topology, and data. Combining SPLITT likelihood calculation with adaptive Metropolis sampling on real data, we show that the time for Bayesian POUMM inference on a tree of 10,000 tips can be reduced from several days to less than an hour. We conclude that parallel pruning effectively accelerates the likelihood calculation and, thus, the statistical inference of Gaussian phylogenetic models.
For time‐intensive Bayesian inferences, we recommend combining this technique with adaptive Metropolis sampling. Beyond Gaussian models, the parallel tree traversal can be applied to numerous other models, including discrete trait and birth–death population dynamics models. Currently, SPLITT supports multi‐core shared memory architectures, but can be extended to distributed memory architectures as well as graphical processing units.

Highlights

  • Phylogenetic comparative models (PCMs) have been used for studying the evolution of various biological species, ranging from micro-organisms to animals and plants

  • Recent works have shown that for many PCMs, the likelihood can be obtained in time proportional to the size of the tree based on post-order tree traversal, known as pruning

  • Combining Serial and Parallel LIneage Traversal of Trees (SPLITT) likelihood calculation with adaptive Metropolis sampling on real data, we show that the time for Bayesian phylogenetic Ornstein–Uhlenbeck mixed model (POUMM) inference on a tree of 10,000 tips can be reduced from several days to less than an hour

Summary

INTRODUCTION

Phylogenetic comparative models (PCMs) have been used for studying the evolution of various biological species, ranging from micro-organisms to animals and plants. The inherent complexity of these models poses new challenges for parameter inference and model selection. In an effort to speed up PCM inference, recent works have shown that, for a broad family of PCMs, the likelihood of an observed phylogenetic tree and data conditioned on the model parameters can be computed in time proportional to the size of the tree (FitzJohn, 2012; Goolsby, Bruggeman, & Ané, 2016; Ho & Ané, 2014; Mitov, Bartoszek, Asimomitis, & Stadler, 2018). We show that our parallel pruning algorithm, coupled with adaptive Metropolis samplers, dramatically reduces the time for Bayesian analysis of trees with thousands of tips.
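
To illustrate how pruning yields the likelihood in a single pass over the tree, the following is a minimal C++ sketch for the simplest case, a single-trait Brownian motion model. It is not SPLITT's API: the `Node` and `Quad` structs and the function `bm_loglik` are hypothetical names introduced here, and the sketch assumes that nodes are stored in post-order (every child before its parent, root last) and that all non-root branch lengths are strictly positive. Each node passes to its parent the coefficients of a quadratic function of the parent's trait value, so the work per node is constant.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical node layout for this sketch (not SPLITT's data structure).
struct Node {
  int parent;   // index of the parent node, -1 for the root
  double t;     // length of the branch leading to this node (> 0, unused for root)
  bool is_tip;  // true if the node carries an observed trait value
  double z;     // observed trait value (used for tips only)
};

// Coefficients of a log-likelihood of the form a*x^2 + b*x + c,
// where x is the (unobserved) trait value at a node.
struct Quad {
  double a = 0.0, b = 0.0, c = 0.0;
};

// Log-likelihood of the tip data under Brownian motion with rate sigma2,
// conditional on the root trait value x0. Assumes post-order node storage.
double bm_loglik(const std::vector<Node>& nodes, double sigma2, double x0) {
  const double two_pi = 2.0 * std::acos(-1.0);
  std::vector<Quad> acc(nodes.size());  // sums of the children's contributions

  for (std::size_t i = 0; i < nodes.size(); ++i) {
    const Node& n = nodes[i];
    if (n.parent < 0) {
      // Root: all children have been processed; evaluate the quadratic at x0.
      return acc[i].a * x0 * x0 + acc[i].b * x0 + acc[i].c;
    }
    const double v = sigma2 * n.t;  // variance accumulated along the branch
    Quad up;                        // contribution as a function of the parent's value
    if (n.is_tip) {
      // log N(z; x_parent, v) written as a quadratic in x_parent.
      up.a = -0.5 / v;
      up.b = n.z / v;
      up.c = -0.5 * n.z * n.z / v - 0.5 * std::log(two_pi * v);
    } else {
      // Integrate out this node's value given the parent's value.
      const double d = 1.0 - 2.0 * acc[i].a * v;
      up.a = acc[i].a / d;
      up.b = acc[i].b / d;
      up.c = acc[i].c + 0.5 * acc[i].b * acc[i].b * v / d - 0.5 * std::log(d);
    }
    Quad& p = acc[static_cast<std::size_t>(n.parent)];
    p.a += up.a;
    p.b += up.b;
    p.c += up.c;
  }
  return std::numeric_limits<double>::quiet_NaN();  // no root found
}
```

In this sketch, the contributions of sibling subtrees are computed independently and only combined at their common parent; this independence is the property that a parallel traversal can exploit.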
