Abstract

The human microbiome consists of thousands of different microbial species, and tens of thousands of bioactive small molecules are associated with them. These associated molecules include the biosynthetic products of microbiota and the products of microbial transformation of host molecules, dietary components, and pharmaceuticals. The existing methods for characterization of these small molecules are currently time consuming and expensive, and they are limited to the cultivable bacteria. Here, we propose a method for detecting microbiota-associated small molecules based on the patterns of cooccurrence of molecular and microbial features across multiple microbiomes. We further map each molecule to the clade in a phylogenetic tree that is responsible for its production/transformation. We applied our proposed method to the tandem mass spectrometry and metagenomics data sets collected by the American Gut Project and to microbiome isolates from cystic fibrosis patients and discovered the genes in the human microbiome responsible for the production of corynomycolenic acid, which serves as a ligand for human T cells and induces a specific immune response against infection. Moreover, our method correctly associated pseudomonas quinolone signals, tyrvalin, and phevalin with their known biosynthetic gene clusters.

IMPORTANCE

Experimental advances have enabled the acquisition of tandem mass spectrometry and metagenomics sequencing data from tens of thousands of environmental/host-oriented microbial communities. Each of these communities contains hundreds of microbial features (corresponding to microbial species) and thousands of molecular features (corresponding to microbial natural products). However, with the current technology, it is very difficult to identify the microbial species responsible for the production/biotransformation of each molecular feature. Here, we develop association networks, a new approach for identifying the microbial producer/biotransformer of natural products through cooccurrence analysis of metagenomics and mass spectrometry data collected on multiple microbiomes.

Highlights

  • Experimental advances have enabled the acquisition of tandem mass spectrometry and metagenomics sequencing data from tens of thousands of environmental/host-oriented microbial communities

  • The American Gut Project (AGP) data set consists of liquid chromatography-tandem mass spectrometry (LC-MS/MS) and 16S rRNA data collected from the human gut microbiomes of 2,125 subjects

  • We further mapped shotgun metagenomics data from samples in which pseudomonas quinolone signals (PQS) were present against the PQS biosynthetic gene cluster (BGC); 2,472 of 2,488,704 reads (about 0.1%) mapped to the PQS BGC


Summary

Introduction

The majority of known microbial products and biotransformation products were discovered through the targeted analysis of specific molecules, such as short-chain fatty acids, secondary bile acids, and oral drugs, in model systems (e.g., mice with a controlled diet and environment) [5,6,7]. These methods do not generalize to complex communities like the human microbiome, where it is impossible to control environmental factors. Recent large-scale microbiome data sets, such as the Integrative Human Microbiome Project (iHMP) [8] and the American Gut Project (AGP) [9], collect microbial and molecular abundance profiles over thousands of human microbiota samples, providing us with an unprecedented opportunity to explore the interactions between microorganisms, enzymes, and molecules in complex communities. However, there is no consensus on how to extract features from LC-MS data or on which association test should be used.
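
To make the idea of a cooccurrence-based association test concrete, the sketch below scores one molecular feature against one microbial feature with a one-sided Fisher exact test on presence/absence across samples. This is an illustrative assumption, not the test used by the authors: the function names, the choice of Fisher's exact test, and the toy presence/absence vectors are all hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the chance of
    seeing at least `a` co-present samples if the two features were
    independent. Hypothetical helper, written for illustration only.
    """
    row1 = a + b          # samples where the molecule is present
    col1 = a + c          # samples where the microbe is present
    n = a + b + c + d     # all samples
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


def cooccurrence_pvalue(molecule_present, microbe_present):
    """Build the 2x2 presence/absence table over samples and test it."""
    pairs = list(zip(molecule_present, microbe_present))
    a = sum(1 for m, t in pairs if m and t)          # both present
    b = sum(1 for m, t in pairs if m and not t)      # molecule only
    c = sum(1 for m, t in pairs if not m and t)      # microbe only
    d = sum(1 for m, t in pairs if not m and not t)  # neither
    return fisher_exact_greater(a, b, c, d)


# Toy data (invented): one entry per sample, 1 = feature detected.
molecule = [1, 1, 1, 1, 0, 0, 0, 0]
microbe = [1, 1, 1, 1, 0, 0, 0, 0]
p_value = cooccurrence_pvalue(molecule, microbe)
```

In a real analysis with hundreds of microbial features and thousands of molecular features per community, every molecule-microbe pair would be tested, so the resulting p-values would need a multiple-testing correction before any association is reported.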

