Abstract

eggNOG is a public resource that provides Orthologous Groups (OGs) of proteins at different taxonomic levels, each with integrated and summarized functional annotations. Developments since the latest public release include changes to the algorithm for creating OGs across taxonomic levels, making nested groups hierarchically consistent. This allows for a better propagation of functional terms across nested OGs and led to the novel annotation of 95 890 previously uncharacterized OGs, increasing overall annotation coverage from 67% to 72%. The functional annotations of OGs have been expanded to also provide Gene Ontology terms, KEGG pathways and SMART/Pfam domains for each group. Moreover, eggNOG now provides pairwise orthology relationships within OGs based on analysis of phylogenetic trees. We have also incorporated a framework for quickly mapping novel sequences to OGs based on precomputed HMM profiles. Finally, eggNOG version 4.5 incorporates a novel data set spanning 2605 viral OGs, covering 5228 proteins from 352 viral proteomes. All data are accessible for bulk downloading, as a web-service, and through a completely redesigned web interface. The new access points provide faster searches and a number of new browsing and visualization capabilities, facilitating the needs of both experts and less experienced users. eggNOG v4.5 is available at http://eggnog.embl.de.
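To make "hierarchically consistent" concrete, the sketch below tests one reading of the property: every OG at a narrower taxonomic level should sit entirely inside a single OG at the broader level. This is not the eggNOG clustering algorithm. The function name `inconsistent_groups`, the dictionary layout, and all OG and protein identifiers are assumptions made up for illustration.

```python
from typing import Dict, List, Set


def inconsistent_groups(child_ogs: Dict[str, Set[str]],
                        parent_ogs: Dict[str, Set[str]]) -> List[str]:
    """Return IDs of child-level OGs that are not nested in exactly one parent-level OG.

    Both arguments map a (hypothetical) OG identifier to its set of protein IDs.
    """
    # Index every protein by the parent-level OG that contains it.
    protein_to_parent = {}
    for og_id, members in parent_ogs.items():
        for protein in members:
            protein_to_parent[protein] = og_id

    bad = []
    for og_id, members in child_ogs.items():
        # Collect the parent OGs touched by this child OG (None = protein not found).
        parents = {protein_to_parent.get(protein) for protein in members}
        if len(parents) != 1 or None in parents:
            bad.append(og_id)
    return sorted(bad)


# Toy data: C2 straddles two parent groups, so it breaks the nesting.
parent_level = {"P1": {"a", "b", "c", "d"}, "P2": {"e", "f"}}
child_level = {"C1": {"a", "b"}, "C2": {"c", "e"}, "C3": {"f"}}

result = inconsistent_groups(child_level, parent_level)
print(result)  # ['C2']
```

When a hierarchy passes such a check, an annotation attached to a parent OG can be passed down to the OGs nested inside it without ambiguity, which is the kind of propagation the abstract describes.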

Highlights

  • Orthology and paralogy are central concepts in evolutionary biology. They allow distinguishing between molecular sequences that, despite sharing a common ancestry, evolved by different mechanisms: orthologs are the result of speciation events, whereas paralogs originate from gene duplications. This distinction is widely used in molecular biology, since the evolutionary forces shaping the respective classes of sequences are profoundly different and impact the analysis of functional divergence [1].

  • The most notable changes in eggNOG v4.5 include (i) modifications to the clustering algorithm that make Orthologous Groups (OGs) hierarchically consistent across taxonomic levels, (ii) improved annotation of OGs, (iii) Hidden Markov Model (HMM)-based tools for fast assignment of protein sequences to OGs, (iv) the addition of viral OGs, (v) fine-grained orthology inferences derived from phylogenetic analysis, (vi) a completely redesigned web interface and (vii) programmatic access through a RESTful Application Programming Interface (API). eggNOG v4.5 is available at http://eggnog.embl.de.

  • At the taxonomic level of each OG, available functional annotations are collected from many sources, including free-text descriptions in source genome databases, COG functional categories [8], Gene Ontology terms [42], KEGG pathways [43] and SMART/Pfam protein domains [44,45].

Introduction

Orthology and paralogy are central concepts in evolutionary biology. They allow distinguishing between molecular sequences that, despite sharing a common ancestry, evolved by different mechanisms: orthologs are the result of speciation events, whereas paralogs originate from gene duplications. Graph-based algorithms allow the analysis of more species at once and produce groups of orthologous sequences whose common ancestor is defined by the set of species considered at a given taxonomic level.
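The speciation/duplication distinction can be illustrated on a small gene tree. The sketch below uses the species-overlap heuristic, a common way to label internal nodes: if the two subtrees under a node share a species, the node is treated as a duplication, otherwise as a speciation. The text does not say which method eggNOG uses, so this is an illustrative assumption. The nested-tuple tree encoding, the `species|gene` leaf naming, and the function names are likewise hypothetical, and the tree is assumed to be strictly binary.

```python
from itertools import product


def leaves(node):
    """Return all leaf labels under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node:
        out.extend(leaves(child))
    return out


def species(leaf):
    """Extract the species prefix from a 'species|gene' leaf label."""
    return leaf.split("|")[0]


def classify_pairs(tree):
    """Label gene pairs as orthologs or paralogs using species overlap.

    Returns two sets of frozenset({gene_a, gene_b}) pairs.
    """
    orthologs, paralogs = set(), set()

    def visit(node):
        if isinstance(node, str):
            return
        left, right = node  # assumes a binary tree
        left_leaves, right_leaves = leaves(left), leaves(right)
        shared = {species(x) for x in left_leaves} & {species(x) for x in right_leaves}
        # Shared species on both sides -> duplication node -> paralogous pairs.
        target = paralogs if shared else orthologs
        for a, b in product(left_leaves, right_leaves):
            target.add(frozenset((a, b)))
        visit(left)
        visit(right)

    visit(tree)
    return orthologs, paralogs


# Toy tree: a duplication in the human/mouse ancestor, with fly as outgroup.
toy_tree = ((("human|A1", "mouse|A1"), ("human|A2", "mouse|A2")), "fly|A")
orthologs, paralogs = classify_pairs(toy_tree)
print(len(orthologs), len(paralogs))  # 6 4
```

In this toy tree the fly gene is orthologous to all four mammalian genes, human A1 and mouse A1 are orthologs, and human A1 and human A2 are paralogs because they are separated by the duplication node.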
