Abstract

Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities.

Highlights

  • Microbiota research today embodies the data-rich nature of modern biology

  • Each balance yÃi is associated with a single internal node i of a phylogenetic tree with the D taxa as leaves

  • The unit-length ensures that the variance of Phylogenetic ILR (PhILR) balances has a consistent scale, unlike the variance of log-ratios originally proposed by Aitchison (Aitchison, 1986) as a measure of association, in which it can be unclear what constitutes a large or small variance (Friedman and Alm, 2012)

Read more

Summary

Introduction

Microbiota research today embodies the data-rich nature of modern biology. Advances in highthroughput DNA sequencing allow for rapid and affordable surveys of thousands of bacterial taxa across hundreds of samples (Caporaso et al, 2011). The exploding availability of sequencing data has poised microbiota research to advance our understanding of fields as diverse as ecology, evolution, medicine, and agriculture (Waldor et al, 2015). Considerable effort focuses on interrogating microbiota datasets to identify relationships between bacterial taxa, as well as between microbes and their environment. It is appreciated that the relative nature of microbial abundance data in microbiota studies can lead to spurious statistical analyses (Jackson, 1997; Friedman and Alm, 2012; Aitchison, 1986; Lovell et al, 2011; Gloor et al, 2016a; Britanova et al, 2014; Li, 2015; Tsilimigras and Fodor, 2016).

Objectives
Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call