Abstract

Knowing which residues in an enzyme are catalytic and what function they perform is crucial to many biological studies, particularly those leading to new therapeutics and enzyme design. The Catalytic Site Atlas (CSA) (http://www.ebi.ac.uk/thornton-srv/databases/CSA) catalogs the residues involved in enzyme catalysis in experimentally determined protein structures. The original version, published in 2004, had only 177 curated entries and employed a simplistic approach to expanding these annotations to homologous enzyme structures. Here we present a new version of the CSA (CSA 2.0), which greatly expands the number of both curated (968) and automatically annotated catalytic sites in enzyme structures, utilizing a new method for annotation transfer. The curated entries are used, along with the variation in residue type from the sequence comparison, to generate 3D templates of the catalytic sites, which in turn can be used to find catalytic sites in new structures. To ease the transfer of CSA annotations to other resources, a new ontology has been developed: the Enzyme Mechanism Ontology, which has permitted the transfer of annotations to the Mechanism, Annotation and Classification in Enzymes (MACiE) and UniProt Knowledge Base (UniProtKB) resources. The CSA database schema has been redesigned, and both the CSA data and search capabilities are presented in a new, modern web interface.
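The idea of transferring curated annotations to a homologous structure can be illustrated with a minimal sketch. It is not the CSA's actual pipeline: it assumes a gapped pairwise alignment between a curated reference sequence and a target sequence has already been produced by some search tool, and it only shows the coordinate mapping step. The names `TransferredSite` and `transfer_sites` are hypothetical, as are the toy sequences.

```python
from typing import Iterable, List, NamedTuple, Optional


class TransferredSite(NamedTuple):
    """One curated catalytic residue mapped onto a homologous target."""
    ref_pos: int                 # 1-based position in the ungapped reference
    ref_res: str                 # residue type in the reference
    target_pos: Optional[int]    # 1-based position in the target, None if gapped
    target_res: Optional[str]    # residue type in the target, None if gapped
    conserved: bool              # True only if the residue type is identical


def transfer_sites(
    aligned_ref: str,
    aligned_target: str,
    catalytic_positions: Iterable[int],
    gap: str = "-",
) -> List[TransferredSite]:
    """Map reference catalytic positions through a pairwise alignment.

    Both aligned strings must have the same length and use `gap` for gaps.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")

    wanted = set(catalytic_positions)
    ref_index = 0      # residues consumed so far in the reference
    target_index = 0   # residues consumed so far in the target
    sites: List[TransferredSite] = []

    for ref_res, target_res in zip(aligned_ref, aligned_target):
        if ref_res != gap:
            ref_index += 1
        if target_res != gap:
            target_index += 1
        if ref_res == gap or ref_index not in wanted:
            continue
        if target_res == gap:
            # The catalytic residue has no counterpart in the target.
            sites.append(TransferredSite(ref_index, ref_res, None, None, False))
        else:
            sites.append(
                TransferredSite(
                    ref_index, ref_res, target_index, target_res,
                    ref_res == target_res,
                )
            )
    return sites


# Toy alignment: reference "MKHDS" with catalytic residues at positions 3, 4, 5.
example = transfer_sites("MK-HDS", "MKAH-S", [3, 4, 5])
```

In this toy case the histidine and serine map to target positions 4 and 5, while the aspartate falls opposite a gap and is reported as not transferable. A real transfer would also need to decide how much residue-type variation to tolerate at each position.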

Highlights

  • Enzymes represent ~45% of the collective protein products of all the genomes cataloged by resources such as the UniProt Knowledge Base (UniProtKB) (1)

  • The Catalytic Site Atlas (CSA) (2) was established to provide curated annotations of the small number of highly conserved residues that are directly involved in undertaking the catalytic activity in enzymes whose structures have been deposited in the Protein Data Bank (PDB) (3)

  • We have developed a new ontology, the Enzyme Mechanism Ontology (EMO), which permits the integration of CSA information into both MACiE and UniProtKB data structures and can be used across resources as a controlled vocabulary for describing aspects of protein sequence and structure in chemical and mechanistic terms


Summary

INTRODUCTION

Enzymes represent ~45% of the collective protein products of all the genomes cataloged by resources such as the UniProt Knowledge Base (UniProtKB) (1). The Catalytic Site Atlas (CSA) (2) was established to provide curated annotations of the small number of highly conserved residues that are directly involved in the catalytic activity of enzymes whose structures have been deposited in the Protein Data Bank (PDB) (3). These curated entries can in turn be used to infer catalytic residues in other enzyme structures through homology, using a simple PSI-BLAST method. CSA 2.0 provides a manually curated resource of 968 enzyme structures and their catalytic sites, including information on the functional part of each catalytic residue and its role in the enzyme mechanism. Though descriptions and definitions of some of the information held in all three databases are made in existing ontologies such as GO (11) and the ChEBI (12) ontology, marrying these

