Abstract

Shotgun metagenomics has been applied to study the functionality of various microbial communities. As a critical analysis step in these studies, biological pathways are reconstructed based on the genes predicted from metagenomic shotgun sequences. Pathway reconstruction provides insights into the functionality of a microbial community and can be used for comparing multiple microbial communities. The utility of pathway reconstruction, however, can be compromised by imperfect functional annotation of genes and by ambiguity in the assignment of predicted enzymes to biochemical reactions (e.g., some enzymes are involved in multiple biochemical reactions). Considering that metabolic functions in a microbial community are carried out by many enzymes in a collaborative manner, we present a probabilistic sampling approach to profiling functional content in a metagenomic dataset, by sampling functions of catalytically promiscuous enzymes within the context of the entire metabolic network defined by the annotated metagenome. We test our approach on metagenomic datasets from environmental and human-associated microbial communities. The results show that our approach provides a more accurate representation of the metabolic activities encoded in a metagenome, and thus improves the comparative analysis of multiple microbial communities. In addition, our approach reports likelihood scores of putative reactions, which can be used to identify important reactions and metabolic pathways that reflect the environmental adaptation of the microbial communities. Source code for sampling metabolic networks is available online at http://omics.informatics.indiana.edu/mg/MetaNetSam/.

Highlights

  • Metagenomics aims to analyze the microbial communities directly extracted from their living environment, bypassing the requirements of isolating and culturing the microbes

  • Functional annotations are often obtained by similarity search against gene families collected in databases of biological pathways, such as Kyoto Encyclopedia of Genes and Genomes (KEGG) [6], MetaCyc [7], or SEED [8], so that biological pathways can be reconstructed from the predicted functions

  • We present a probabilistic sampling approach to profiling metabolic reactions in a microbial community from metagenomic shotgun reads, in an attempt to understand the metabolism within a microbial community and compare it across multiple communities


Summary

Introduction

Metagenomics aims to analyze microbial communities directly extracted from their living environment, bypassing the requirement of isolating and culturing the microbes. The list of metagenomics studies is growing rapidly [1,2], which provides ample opportunities for researchers to develop new computational methods to analyze the sequences from metagenomics projects. To understand the functional and metabolic potential of a microbial community given the sequencing data, a key analysis is to predict protein-coding genes and their functions from raw NGS reads or assembled contigs. Although the principle is the same, different annotation systems follow different practices: for example, the HUMAnN pipeline directly predicts gene families and pathways from short sequence reads based on similarity searches [9], while MG-RAST first predicts protein-coding regions from short reads de novo, and then predicts the functions of the predicted proteins based on similarity searches [10].
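To make the idea of sampling the functions of promiscuous enzymes in their network context more concrete, the following is a minimal toy sketch in Python. It is not the MetaNetSam implementation and does not reproduce the authors' model: the enzymes, reactions, metabolites, the `context_weight` scoring rule, and the function names are all hypothetical, chosen only to illustrate how an enzyme with several candidate reactions could be resolved by repeatedly sampling one reaction per enzyme, favoring reactions that share metabolites with the rest of the sampled network.

```python
import random
from collections import Counter

# Hypothetical toy data (not from the paper): reaction -> metabolites it involves.
REACTIONS = {
    "R1": {"A", "B"},
    "R2": {"B", "C"},
    "R3": {"X", "Y"},
}

# Hypothetical annotation: enzyme -> candidate reactions it may catalyze.
CANDIDATES = {
    "E1": ["R1"],        # unambiguous enzyme
    "E2": ["R2", "R3"],  # promiscuous enzyme with two candidate reactions
}


def context_weight(reaction, other_reactions):
    """Assumed scoring rule: 1 + metabolites shared with the other sampled reactions."""
    shared = sum(len(REACTIONS[reaction] & REACTIONS[o]) for o in other_reactions)
    return 1 + shared


def sample_reaction_scores(candidates, sweeps=2000, seed=0):
    """Resample one reaction per enzyme in turn; return per-reaction sampling frequency."""
    rng = random.Random(seed)
    # Start from a random assignment of one reaction to each enzyme.
    assignment = {e: rng.choice(options) for e, options in candidates.items()}
    counts = Counter()
    for _ in range(sweeps):
        for enzyme, options in candidates.items():
            others = [r for e, r in assignment.items() if e != enzyme]
            weights = [context_weight(r, others) for r in options]
            assignment[enzyme] = rng.choices(options, weights=weights, k=1)[0]
        # Count each reaction at most once per sweep.
        counts.update(set(assignment.values()))
    return {r: counts[r] / sweeps for r in REACTIONS}


scores = sample_reaction_scores(CANDIDATES)
for reaction, score in sorted(scores.items()):
    print(reaction, round(score, 3))
```

In this toy network, R2 shares metabolite B with R1 while R3 shares nothing, so the sampler selects R2 for the promiscuous enzyme E2 more often than R3. The resulting frequencies play a role loosely analogous to the likelihood scores of putative reactions mentioned in the abstract, under the assumptions stated above.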
