Abstract
Background: Next-generation sequencing and metagenome projects yield a large number of new genomes that need further annotation, such as identification of enzymes and metabolic pathways, or analysis of the metabolic strategies of newly sequenced species in comparison to known organisms. While methods for enzyme identification are available, development of command-line tools for high-throughput comparative analysis and visualization of identified enzymes is lagging.
Methods: A set of Perl scripts has been developed to perform automated data retrieval from the KEGG database using its new REST application programming interface (API). Enrichment or depletion in metabolic pathways is evaluated using the two-tailed Fisher exact test followed by Benjamini and Hochberg correction.
Results: Comparative analysis of a given set of enzymes with a specified reference organism includes mapping to known metabolic pathways, finding shared and unique enzymes, generating links to visualize maps at KEGG Pathway, computing enrichment of the pathways, and listing the non-mapped enzymes.
Conclusions: EC2KEGG provides a platform-independent toolkit for automated comparison of identified sets of enzymes from newly sequenced organisms against annotated reference genomes. The tool can be used both for manual annotation of individual species and for high-throughput annotation as part of a computational pipeline. The tool is publicly available at http://sourceforge.net/projects/ec2kegg/.
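The statistical step named in the Methods (two-tailed Fisher exact test followed by Benjamini and Hochberg correction) can be illustrated with a minimal Python sketch. This is not the EC2KEGG implementation, which is written in Perl. The function names, the layout of the 2x2 table (query versus reference counts, inside versus outside a pathway), and the pathway labels and counts in the usage note are assumptions made only for illustration.

```python
from math import comb


def fisher_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def pathway_enrichment(counts, query_total, ref_total):
    """counts maps a pathway label to (query_hits, reference_hits).

    Assumed table layout: rows are query/reference, columns are
    inside/outside the pathway. Returns {pathway: (p, adjusted_p)}.
    """
    names = list(counts)
    pvals = [
        fisher_two_tailed(q, query_total - q, r, ref_total - r)
        for q, r in (counts[name] for name in names)
    ]
    return dict(zip(names, zip(pvals, benjamini_hochberg(pvals))))
```

With hypothetical counts, `pathway_enrichment({"pathway_A": (12, 30), "pathway_B": (2, 25)}, query_total=200, ref_total=900)` returns a raw and an adjusted p-value per pathway. Whether a pathway is enriched or depleted is then read from the direction of the difference in proportions, since the two-tailed test itself does not report direction.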
Highlights
Next-generation sequencing and metagenome projects yield a large number of new genomes that need further annotations, such as identification of enzymes and metabolic pathways, or analysis of metabolic strategies of newly sequenced species in comparison to known organisms
Information about organism-specific genes, enzymes, and pathways is automatically retrieved from the KEGG database using its new representational state transfer (REST) application programming interface (API)
The number of genes in a pathway is defined by KEGG annotation for a given reference organism
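The retrieval step described in these highlights can be sketched in Python as follows. This is an illustration rather than the tool's own Perl code. It assumes the KEGG REST service is reachable at https://rest.kegg.jp and that the `link` operation returns two tab-separated columns per line (source identifier, then target identifier). The helper names `kegg_url`, `fetch`, and `parse_link` are hypothetical.

```python
from collections import defaultdict
from urllib.request import urlopen

# Assumed base URL of the KEGG REST service.
KEGG_REST = "https://rest.kegg.jp"


def kegg_url(operation, *args):
    """Build a REST URL such as https://rest.kegg.jp/link/enzyme/eco."""
    return "/".join([KEGG_REST, operation, *args])


def fetch(operation, *args):
    """Download the plain-text response for one REST request."""
    with urlopen(kegg_url(operation, *args)) as response:
        return response.read().decode("utf-8")


def parse_link(text):
    """Parse two-column, tab-delimited output into {source: {targets}}."""
    mapping = defaultdict(set)
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        mapping[parts[0]].add(parts[1])
    return dict(mapping)


def count_genes_per_pathway(gene_to_pathways):
    """Invert a gene -> pathways mapping and count genes in each pathway."""
    counts = defaultdict(int)
    for pathways in gene_to_pathways.values():
        for pathway in pathways:
            counts[pathway] += 1
    return dict(counts)
```

For a reference organism with KEGG code `eco`, `parse_link(fetch("link", "enzyme", "eco"))` would give a gene-to-EC-number mapping, and `count_genes_per_pathway(parse_link(fetch("link", "pathway", "eco")))` would give the number of annotated genes per pathway for that organism. Both calls need network access and depend on the current behavior of the service.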
Summary
Next-generation sequencing and metagenome projects yield a large number of new genomes that need further annotation, such as identification of enzymes and metabolic pathways, or analysis of the metabolic strategies of newly sequenced species in comparison to known organisms. Annotation of individual genomes with respect to identification of enzymes has been well developed and implemented in various packages, such as PRIAM [1], SHARKhunt [2], and Blast2GO [3]. These tools do not provide comparative analysis of metabolic pathways between different organisms with subsequent visualization of results. This limitation has been addressed by some approaches, such as Comparative Pathway Analyzer [4] or ComPath [5]. The most up-to-date and fully operational web server currently available to achieve these tasks is KEGG Mapper (http://www.kegg.jp/kegg/mapper.html).