Abstract

A wide range of research areas in molecular biology and medical biochemistry require a reliable enzyme classification system, e.g., drug design, metabolic network reconstruction and systems biology. When research scientists in these areas wish to refer unambiguously to an enzyme and its function, they use the EC number introduced by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB). However, each of these applications depends critically on the consistency and reliability of the underlying data. We have developed tools for the validation of the EC number classification scheme. In this paper, we present validated data for 3788 enzymatic reactions covering 229 sub-subclasses of the EC classification system. Over 80% agreement was found between our assignment and the EC classification. For 61 (i.e., only 2.5%) reactions we found that their assignment was inconsistent with the rules of the nomenclature committee; they have to be transferred to other sub-subclasses. We demonstrate that our validation results can be used to initiate corrections and improvements to the EC number classification scheme.

Highlights

  • With several thousand proteins found in each organism, a highly developed, hierarchical and consistent classification scheme is essential for comparing the metabolic capacities of organisms. Such a system exists only for enzymes and not for other protein classes, but for enzymes the classification scheme gives immediate access to functional properties, including the catalysed reaction, substrate specificity, etc. A quick comparative assessment of enzymatic pathways between organisms is therefore possible even when the enzymes in the different organisms have totally different sequences, as long as they belong to the same EC class

  • The main databases to be taken into account to provide a complete cross-link between genes and their corresponding enzymes are NCBI EntrezGene [3], Ensembl [4], Kyoto Encyclopedia of Genes and Genomes (KEGG) [5], MetaCyc [6] and BRENDA [7]

  • A further problem is the widespread use of incomplete EC numbers such as 1.-.-.-. This often occurs because an enzymatic function is inferred from the existence of a certain pair of metabolites, or is shown experimentally only in a cell extract, without the full biochemical characterisation of the enzyme that the International Union of Biochemistry and Molecular Biology (IUBMB) Nomenclature Committee requires for the assignment of EC numbers [9]


Summary

Introduction

With several thousand proteins found in each organism, a highly developed, hierarchical and consistent classification scheme is essential for comparing the metabolic capacities of organisms. A further problem is the widespread use of incomplete EC numbers such as 1.-.-.- (e.g. in UniProt entry AK1C3_HUMAN). This often occurs because an enzymatic function is inferred from the existence of a certain pair of metabolites, or is shown experimentally only in a cell extract, without the full biochemical characterisation of the enzyme that the IUBMB Nomenclature Committee requires for the assignment of EC numbers [9]. The Kyoto Encyclopedia of Genes and Genomes (KEGG) developed a tool for the computational assignment of EC numbers, published by Kotera et al. [11]. In this approach, each reaction formula is manually decomposed into sets of corresponding substrate and product molecules, which are called reactant pairs.
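To make the notion of an incomplete EC number concrete, the following is a minimal Python sketch of how annotations such as 1.-.-.- might be detected when screening a data set. It is not part of the validation tools described in the paper: the names `ECNumber` and `parse_ec` are hypothetical, and the sketch assumes the plain four-level dotted notation only, with a dash marking an unassigned level, so other formats (for example preliminary numbers) are rejected.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Four dot-separated levels, each either an integer or "-" (unassigned).
# An optional "EC" prefix is tolerated. Assumption: no other formats occur.
_EC_RE = re.compile(r"^(?:EC\s*)?(\d+|-)\.(\d+|-)\.(\d+|-)\.(\d+|-)$")


class ECNumber(NamedTuple):
    """Hypothetical container: class, subclass, sub-subclass, serial number."""

    levels: Tuple[Optional[int], ...]

    @property
    def depth(self) -> int:
        # Number of assigned levels; dashes are trailing (checked in parse_ec).
        return sum(level is not None for level in self.levels)

    @property
    def is_complete(self) -> bool:
        return self.depth == 4

    def sub_subclass(self) -> Optional[str]:
        # First three levels, e.g. "1.1.1", or None if not assigned that far.
        if self.depth < 3:
            return None
        return ".".join(str(level) for level in self.levels[:3])


def parse_ec(text: str) -> ECNumber:
    """Parse a string such as '1.1.1.1' or '1.-.-.-' into an ECNumber."""
    match = _EC_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a four-level EC number: {text!r}")
    levels = tuple(None if g == "-" else int(g) for g in match.groups())
    # Reject an assigned level that follows an unassigned one, e.g. '1.-.3.-'.
    seen_dash = False
    for level in levels:
        if level is None:
            seen_dash = True
        elif seen_dash:
            raise ValueError(f"assigned level after '-': {text!r}")
    return ECNumber(levels)
```

Under these assumptions, `parse_ec("1.-.-.-").is_complete` evaluates to `False`, and `parse_ec("EC 1.1.1.1").sub_subclass()` returns `"1.1.1"`, so incomplete annotations can be filtered out or grouped by sub-subclass before any comparison is made.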

