Abstract

Background

Microbiome samples often represent mixtures of communities, where each community is composed of overlapping assemblages of species. Such mixtures are complex, the number of species is huge and abundance information for many species is often sparse. Classical methods have a limited value for identifying complex features within such data.

Results

Here, we describe a novel hierarchical model for Bayesian inference of microbial communities (BioMiCo). The model takes abundance data derived from environmental DNA, and models the composition of each sample by a two-level hierarchy of mixture distributions constrained by Dirichlet priors. BioMiCo is supervised, using known features for samples and appropriate prior constraints to overcome the challenges posed by many variables, sparse data, and large numbers of rare species. The model is trained on a portion of the data, where it learns how assemblages of species are mixed to form communities and how assemblages are related to the known features of each sample. Training yields a model that can predict the features of new samples. We used BioMiCo to build models for three serially sampled datasets and tested their predictive accuracy across different time points. The first model was trained to predict both body site (hand, mouth, and gut) and individual human host. It was able to reliably distinguish these features across different time points. The second was trained on vaginal microbiomes to predict both the Nugent score and individual human host. We found that women having normal and elevated Nugent scores had distinct microbiome structures that persisted over time, with additional structure within women having elevated scores. The third was trained for the purpose of assessing seasonal transitions in a coastal bacterial community. Application of this model to a high-resolution time series permitted us to track the rate and time of community succession and accurately predict known ecosystem-level events.

Conclusion

BioMiCo provides a framework for learning the structure of microbial communities and for making predictions based on microbial assemblages. By training on carefully chosen features (abiotic or biotic), BioMiCo can be used to understand and predict transitions between complex communities composed of hundreds of microbial species.

Electronic supplementary material

The online version of this article (doi:10.1186/s40168-015-0073-x) contains supplementary material, which is available to authorized users.

Highlights

  • Microbiome samples often represent mixtures of communities, where each community is composed of overlapping assemblages of species

  • Bayesian inference of microbial communities (BioMiCo) is a hierarchical mixed-membership model. We model each microbiome sample as a mixture of operational taxonomic units (OTUs) from one or more communities, using K prespecified factor values as putative mixture components.

  • Although temporally stable strains have been detected in other longitudinal studies of the human gut [4,21,22], temporal stability could not be detected in this study [1]
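
The mixed-membership idea in the highlights above (samples as mixtures over K known factor values, which in turn mix assemblages of OTUs) can be illustrated with a toy generative simulation. This is a minimal sketch under stated assumptions, not the BioMiCo implementation: the function name `simulate_samples`, the hyperparameters `alpha`, `beta`, `base`, and `boost`, and the way the known label is encoded as extra Dirichlet concentration are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_samples(labels, n_factors, n_assemblages, n_otus,
                     reads_per_sample, alpha=0.5, beta=0.2,
                     base=0.1, boost=5.0, seed=0):
    """Toy two-level mixture: sample -> communities -> assemblages -> OTUs.

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)

    # Upper level: each of the K factor values (communities) is a
    # Dirichlet-distributed mixture over assemblages.
    community_over_assemblages = rng.dirichlet(
        np.full(n_assemblages, alpha), size=n_factors)

    # Lower level: each assemblage is a Dirichlet-distributed
    # distribution over OTUs.
    assemblage_over_otus = rng.dirichlet(
        np.full(n_otus, beta), size=n_assemblages)

    counts = np.zeros((len(labels), n_otus), dtype=int)
    for i, label in enumerate(labels):
        # Supervision (assumed encoding): the known label receives extra
        # prior concentration, so the sample is mostly, but not entirely,
        # drawn from its labelled community.
        concentration = np.full(n_factors, base)
        concentration[label] += boost
        sample_over_communities = rng.dirichlet(concentration)

        # Marginal OTU probabilities for this sample.
        otu_probs = (sample_over_communities
                     @ community_over_assemblages
                     @ assemblage_over_otus)
        otu_probs = otu_probs / otu_probs.sum()  # guard against float drift

        # Observed abundance data: sequencing reads assigned to OTUs.
        counts[i] = rng.multinomial(reads_per_sample, otu_probs)

    return counts, community_over_assemblages, assemblage_over_otus


if __name__ == "__main__":
    # Four samples labelled with one of K = 3 factor values (e.g. body sites).
    counts, comm, assem = simulate_samples(
        labels=[0, 1, 2, 0], n_factors=3, n_assemblages=4,
        n_otus=30, reads_per_sample=1000)
    print(counts.shape)
```

Inference in a model of this kind runs in the opposite direction: given the counts and the known labels of the training samples, one estimates the two mixing matrices, then uses them to predict labels for new samples.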

Introduction

Microbiome samples often represent mixtures of communities, where each community is composed of overlapping assemblages of species. Although sampling community composition via high-throughput amplicon sequencing, or via shotgun metagenomics, is no longer methodologically challenging, the data still pose a significant analytical challenge. Associations within such data are complex, the number of variables is huge, and species abundance information is sparse for many species (or strains) over many samples. In this setting, classic testing procedures have limited ability to identify complex features within the data [11,12].
