Abstract

Background

A convergence of high-throughput sequencing and computational power is transforming biology into information science. Despite these technological advances, converting bits and bytes of sequence information into meaningful insights remains a challenging enterprise. Biological systems operate on multiple hierarchical levels from genomes to biomes. Holistic understanding of biological systems requires agile software tools that permit comparative analyses across multiple information levels (DNA, RNA, protein, and metabolites) to identify emergent properties, diagnose system states, or predict responses to environmental change.

Results

Here we adopt the MetaPathways annotation and analysis pipeline and Pathway Tools to construct environmental pathway/genome databases (ePGDBs) that describe microbial community metabolism using MetaCyc, a highly curated database of metabolic pathways and components covering all domains of life. We evaluate Pathway Tools’ performance on three datasets with different complexity and coding potential, including simulated metagenomes, a symbiotic system, and the Hawaii Ocean Time-series. We define accuracy and sensitivity relationships between read length, coverage and pathway recovery and evaluate the impact of taxonomic pruning on ePGDB construction and interpretation. Resulting ePGDBs provide interactive metabolic maps, predict emergent metabolic pathways associated with biosynthesis and energy production and differentiate between genomic potential and phenotypic expression across defined environmental gradients.

Conclusions

This multi-tiered analysis provides the user community with specific operating guidelines, performance metrics and prediction hazards for more reliable ePGDB construction and interpretation. Moreover, it demonstrates the power of Pathway Tools in predicting metabolic interactions in natural and engineered ecosystems.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-619) contains supplementary material, which is available to authorized users.

Highlights

  • Resulting annotations are used by the PathoLogic algorithm implemented in Pathway Tools to predict metabolic pathways based on multiple criteria including proportion of pathways found, pathway specific enzymatic reactions, and purported taxon-specific pathway distributions

  • Simulations on increasing proportions of the total component genome length (Gm) showed that the performance of pathway recovery based on multiple metrics (F-measure, Matthews Correlation Coefficient, etc.) increased with sequence coverage and sample diversity nearing an asymptote at higher coverage (Figure 2a)
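
The metrics named in the last highlight (F-measure and the Matthews Correlation Coefficient) can be illustrated with a minimal sketch. It assumes pathway recovery is scored as a binary present/absent call for each pathway in a fixed reference universe, which the text above does not spell out. The function name, the pathway identifiers and the example sets are hypothetical and are not taken from the study.

```python
from math import sqrt


def pathway_recovery_metrics(predicted, expected, universe):
    """Score predicted pathways against an expected set.

    Assumption (not from the source): every pathway in `universe` is
    treated as a binary label, present or absent.
    """
    predicted, expected, universe = set(predicted), set(expected), set(universe)
    if not (predicted <= universe and expected <= universe):
        raise ValueError("predicted and expected must be subsets of universe")

    tp = len(predicted & expected)        # pathways correctly recovered
    fp = len(predicted - expected)        # pathways predicted but not expected
    fn = len(expected - predicted)        # expected pathways that were missed
    tn = len(universe) - tp - fp - fn     # pathways correctly left out

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # The MCC uses all four confusion-matrix cells; by convention it is
    # reported as 0 when the denominator vanishes.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical example: ten candidate pathways, four expected, four predicted.
universe = [f"PWY-{i}" for i in range(1, 11)]
expected = {"PWY-1", "PWY-2", "PWY-3", "PWY-4"}
predicted = {"PWY-1", "PWY-2", "PWY-3", "PWY-5"}

scores = pathway_recovery_metrics(predicted, expected, universe)
print(scores)
```

In this toy case three of the four expected pathways are recovered with one false positive, so precision, recall and F-measure all equal 0.75, while the MCC is 14/24 (about 0.58) because it also accounts for the five pathways correctly left out.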

Summary

Introduction

A convergence of high-throughput sequencing and computational power is transforming biology into information science. Although new technologies are rapidly expanding our capacity to chart microbial sequence space, persistent computational and analytical bottlenecks impede comparative analyses across multiple information levels (DNA, RNA, protein and metabolites) [4,5]. This in turn limits our ability to convert the genetic. Functional genes operate within the structure of metabolic pathways and reactions that define metabolic networks. Despite this fact, few metagenomic studies use pathway-centric approaches to predict microbial community interaction networks based on known biochemical rules. Neither HUMAnN nor PRMT provides a coherent structure for exploring and interpreting predicted KEGG pathways.

