Abstract
Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency concerns the biological coherence of annotations, while completeness concerns the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule-based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations: the phenotypic traits of an organism are inferred from the presence of certain metabolic reactions and pathways and then compared to experimentally observed phenotypes. The IMG family of systems is available at http://img.jgi.doe.gov/.
Highlights
The ultimate goal of genome and metagenome analysis is the biological interpretation of the genome sequences in terms of biochemical capabilities of organisms and their niche-specific adaptations, including generation of testable hypotheses about their physiological characteristics.
Several genome analysis resources, such as SEED [1], MicrobesOnline [2], PATRIC [3], KEGG [4], and MetaCyc [5], support biological interpretation of microbial genomes and/or metagenomes by integrating diverse data ranging from nucleotide and protein sequences, to various catalogs of protein families and functional roles, to databases of chemical compounds and reactions.
Integrated Microbial Genomes (IMG) pathways provide the context needed for predicting phenotypes within the IMG system.
Summary
The ultimate goal of genome and metagenome analysis is the biological interpretation of the genome sequences in terms of biochemical capabilities of organisms and their niche-specific adaptations, including generation of testable hypotheses about their physiological characteristics. This process entails associating genes with functional roles that describe their enzymatic activities and their involvement in various macromolecular interactions and regulatory processes. Several genome analysis resources, such as SEED [1], MicrobesOnline [2], PATRIC [3], KEGG [4], and MetaCyc [5], support biological interpretation of microbial genomes and/or metagenomes by integrating diverse data ranging from nucleotide and protein sequences, to various catalogs of protein families and functional roles, to databases of chemical compounds and reactions. Most of these resources maintain computational pipelines that assign functional roles to genes and infer the presence of reactions and pathways.
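The rule-based phenotype prediction described in the abstract can be illustrated with a minimal sketch. It is not IMG's implementation: the rule structure, the pathway identifiers, and the phenotype labels below are all hypothetical, chosen only to show how traits might be inferred from the presence or absence of pathways and then compared with experimentally observed phenotypes.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Set


@dataclass(frozen=True)
class PhenotypeRule:
    """Hypothetical rule: a phenotype is predicted when every required
    pathway is present and no forbidden pathway is present."""
    phenotype: str
    required: FrozenSet[str] = frozenset()
    forbidden: FrozenSet[str] = frozenset()


def predict_phenotypes(pathways: Iterable[str],
                       rules: Iterable[PhenotypeRule]) -> Set[str]:
    """Return the phenotypes whose rules are satisfied by the pathways
    asserted for a genome."""
    present = set(pathways)
    return {
        rule.phenotype
        for rule in rules
        if rule.required <= present and not (rule.forbidden & present)
    }


def compare_with_observed(predicted: Set[str],
                          observed: Set[str]) -> Dict[str, Set[str]]:
    """Split phenotypes into agreements and the two kinds of discrepancy."""
    return {
        "confirmed": predicted & observed,
        # Predicted but not observed: the trait may be untested, or an
        # annotation may be wrong.
        "predicted_only": predicted - observed,
        # Observed but not predicted: an annotation may be missing.
        "observed_only": observed - predicted,
    }


# Illustrative (made-up) rules and genome data.
rules = [
    PhenotypeRule("histidine prototroph",
                  required=frozenset({"histidine_biosynthesis"})),
    PhenotypeRule("biotin auxotroph",
                  forbidden=frozenset({"biotin_biosynthesis"})),
]
genome_pathways = {"histidine_biosynthesis"}
observed = {"histidine prototroph", "lysine prototroph"}

predicted = predict_phenotypes(genome_pathways, rules)
report = compare_with_observed(predicted, observed)
for category, traits in report.items():
    print(category, sorted(traits))
```

In this sketch, a phenotype that is observed but not predicted points at a pathway that may be incompletely annotated, which is the sense in which phenotype prediction can serve as a check on functional annotation.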