Abstract

Evolutionary Classification Of protein Domains (ECOD) (http://prodata.swmed.edu/ecod) comprehensively classifies proteins with known spatial structures maintained by the Protein Data Bank (PDB) into evolutionary groups of protein domains. ECOD relies on a combination of automatic and manual weekly updates to achieve high accuracy and coverage with a short update cycle. ECOD classifies the approximately 120 000 depositions of the PDB into more than 500 000 domains in ∼3400 homologous groups. We show the performance of the weekly update pipeline since the release of ECOD, describe improvements to the ECOD website and available search options, and discuss novel structures and homologous groups that have been classified in recent updates. Finally, we discuss the future directions of ECOD and further improvements planned for the hierarchy and update process.

Highlights

  • Protein three-dimensional structures continue to be determined at an exponential rate, owing both to improvements in structure determination techniques and to the growing number of investigators involved [1,2,3]

  • We developed the Evolutionary Classification Of protein Domains [7] as a hierarchical classification that emphasizes distantly related homologs that are difficult to detect (H- and X-groups) and takes into account closer sequence-based relationships between protein domains, which are placed in families

  • Evolutionary Classification Of protein Domains (ECOD) releases are coupled to Protein Data Bank (PDB) releases, such that for every week that there is a PDB release, there is an ECOD release

Summary

Introduction

Protein three-dimensional structures continue to be determined at an exponential rate, owing both to improvements in structure determination techniques and to the growing number of investigators involved [1,2,3]. Each peptide chain to be classified is individually queried against ECOD reference libraries using a combination of sequence (BLAST, HHsearch) and structural (DALI) aligners [9,10,11,12]. Curated chains were either assigned to ECOD using a combination of alignment data, functional considerations and/or topological similarities to known domains, or assigned to one of several special architectures, which annotate residues that either cannot be classified by our current methodology or lack sufficient data to be classified at all (e.g. low-resolution structures, peptides, fragments).
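
The query-then-curate flow described above can be illustrated with a minimal Python sketch. It is not ECOD's actual pipeline: the `Hit` record, the `triage_chain` function, the normalised `confidence` score, the 0.9 cutoff and the order in which the aligners are tried are all assumptions made for illustration, and the stub searchers stand in for real BLAST, HHsearch and DALI runs.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for the best hit of one aligner."""
    reference_domain: str   # placeholder label, not a real ECOD identifier
    method: str             # e.g. "BLAST", "HHsearch", "DALI"
    confidence: float       # assumed normalised score in [0, 1]


# A searcher maps a chain identifier to its best hit, or None if no hit.
Searcher = Callable[[str], Optional[Hit]]


def triage_chain(
    chain_id: str,
    searchers: Sequence[Searcher],
    min_confidence: float = 0.9,  # assumed cutoff, not taken from the paper
) -> Tuple[str, Optional[Hit]]:
    """Try each aligner in turn; fall back to manual curation.

    Returns ("automatic", hit) for the first hit at or above the cutoff,
    otherwise ("manual_curation", None).
    """
    for search in searchers:
        hit = search(chain_id)
        if hit is not None and hit.confidence >= min_confidence:
            return ("automatic", hit)
    return ("manual_curation", None)


# Stub searchers standing in for real aligner runs (illustrative only).
def blast_stub(chain_id: str) -> Optional[Hit]:
    return None  # pretend the sequence search finds nothing


def hhsearch_stub(chain_id: str) -> Optional[Hit]:
    if chain_id == "chainA":
        return Hit("ref_domain_1", "HHsearch", 0.95)
    return None


def dali_stub(chain_id: str) -> Optional[Hit]:
    if chain_id == "chainB":
        return Hit("ref_domain_2", "DALI", 0.40)  # below the cutoff
    return None


ALIGNERS = [blast_stub, hhsearch_stub, dali_stub]
route_a, hit_a = triage_chain("chainA", ALIGNERS)
route_b, hit_b = triage_chain("chainB", ALIGNERS)
```

In this sketch `chainA` is assigned automatically from its profile-based hit, while `chainB` has only a weak structural hit and is routed to manual curation, where a curator would either place it in the hierarchy or assign it to a special architecture.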
