Abstract

Background

Standard numbering schemes for families of homologous proteins allow the unambiguous identification of functionally and structurally relevant residues, the communication of results on mutations, and the systematic analysis of sequence-function relationships in protein families. Standard numbering schemes have been successfully implemented for several protein families, including lactamases and antibodies. A numbering scheme is still missing, however, for the structural family of thiamine-diphosphate (ThDP)-dependent decarboxylases, a large subfamily of the class of ThDP-dependent enzymes encompassing pyruvate-, benzoylformate-, 2-oxo acid-, indolepyruvate- and phenylpyruvate decarboxylases, benzaldehyde lyase, acetohydroxyacid synthases and 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexadiene-1-carboxylate synthase (MenD). Despite a high structural similarity between the members of the ThDP-dependent decarboxylases, their sequences are diverse, which makes a pairwise sequence comparison of protein family members difficult.

Results

We developed and validated a standard numbering scheme for the family of ThDP-dependent decarboxylases. A profile hidden Markov model (HMM) was created using a set of representative sequences from the family of ThDP-dependent decarboxylases. The pyruvate decarboxylase from S. cerevisiae (ScPDC, PDB: 2VK8) was chosen as a reference because it is a well-characterized enzyme. The crystal structure with the PDB identifier 2VK8 comprises the ScPDC mutant E477Q, the cofactors ThDP and Mg2+, and the substrate analogue (2S)-2-hydroxypropanoic acid. The absolute numbering of this reference sequence was transferred to all members of the ThDP-dependent decarboxylase protein family. Subsequently, the numbering scheme was integrated into the already established Thiamine-diphosphate dependent Enzyme Engineering Database (TEED) and was used to systematically analyze functionally and structurally relevant positions in the superfamily of ThDP-dependent decarboxylases.

Conclusions

The numbering scheme serves as a tool for the reliable sequence alignment of ThDP-dependent decarboxylases and for the unambiguous identification and communication of corresponding positions. It is thus the basis for the systematic and automated analysis of sequence-encoded properties, such as the structural and functional relevance of amino acid positions: the analysis of conserved positions, the identification of correlated mutations and the determination of subfamily-specific amino acid distributions all depend on reliable multiple sequence alignments and on the unambiguous identification of the alignment columns. The method is reliable and robust and can easily be adapted to further protein families.
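The transfer of the reference numbering described in the Results can be illustrated with a minimal sketch. It assumes that a target sequence has already been aligned to the reference (for example via the profile HMM) and that both aligned rows are available as gapped strings; the function name `transfer_numbering` and the toy sequences are hypothetical and are not taken from the TEED or from the ScPDC sequence.

```python
from typing import List, Optional, Tuple

GAP = "-"


def transfer_numbering(
    aligned_ref: str, aligned_target: str, ref_start: int = 1
) -> List[Tuple[int, str, Optional[int]]]:
    """Assign reference (standard) position numbers to target residues.

    Both arguments are rows of the same alignment, so they must have equal
    length. Returns one tuple per target residue:
    (own position in the target, residue letter, standard position).
    The standard position is None when the residue sits in a column where
    the reference has a gap, i.e. an insertion relative to the reference.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")

    mapping = []
    ref_pos = ref_start - 1  # last reference position seen so far
    target_pos = 0           # last target position seen so far
    for ref_char, target_char in zip(aligned_ref, aligned_target):
        if ref_char != GAP:
            ref_pos += 1
        if target_char != GAP:
            target_pos += 1
            standard = ref_pos if ref_char != GAP else None
            mapping.append((target_pos, target_char, standard))
    return mapping


# Toy alignment (invented sequences, for illustration only).
reference = "MSE-ITLG"
target = "M-EAIT-G"
for own_pos, residue, standard_pos in transfer_numbering(reference, target):
    print(own_pos, residue, standard_pos)
```

In this sketch, a residue inserted relative to the reference receives no standard number. How such insertions are labelled in practice is a design decision that the sketch leaves open.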

Highlights

  • Standard numbering schemes for families of homologous proteins allow the unambiguous identification of functionally and structurally relevant residues, the communication of results on mutations, and the systematic analysis of sequence-function relationships in protein families

  • This superfamily contains, among others, pyruvate decarboxylases (PDCs, EC 4.1.1.1), indolepyruvate decarboxylases (IPDCs, EC 4.1.1.74), pyruvate oxidases (POXs, EC 1.2.3.3), the E1 component of pyruvate dehydrogenases (PDHs, EC 1.2.4.1), oxalyl-CoA decarboxylases (OCDCs, EC 4.1.1.8), benzaldehyde lyases (BALs, EC 4.1.2.38), benzoylformate decarboxylases (BFDs, EC 4.1.1.7), acetohydroxyacid synthases (AHASs, EC 2.2.1.6), glyoxylate carboligases (GXCs, EC 4.1.1.47), sulfoacetaldehyde acetyltransferases (SAATs, EC 2.3.3.15), 2-hydroxyphytanoyl-CoA lyases (2-HPCLs) and 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexadiene-1-carboxylate synthase (SEPHCHC, MenD)

  • We present the establishment of a numbering scheme for the ThDP-dependent decarboxylases based on the sequence of the well-documented pyruvate decarboxylase from S. cerevisiae (PDB: 2VK8 [6], Swissprot: P06169)


Introduction

Standard numbering schemes for families of homologous proteins allow the unambiguous identification of functionally and structurally relevant residues, the communication of results on mutations, and the systematic analysis of sequence-function relationships in protein families. Due to the scientific and industrial relevance of enzymes capable of catalysing C-C bond formation and cleavage, we have focused in this work on the decarboxylase superfamily of the ThDP-dependent Enzyme Engineering Database (TEED) [1]. This superfamily contains, among others, pyruvate decarboxylases (PDCs, EC 4.1.1.1), indolepyruvate decarboxylases (IPDCs, EC 4.1.1.74), pyruvate oxidases (POXs, EC 1.2.3.3), the E1 component of pyruvate dehydrogenases (PDHs, EC 1.2.4.1), oxalyl-CoA decarboxylases (OCDCs, EC 4.1.1.8), benzaldehyde lyases (BALs, EC 4.1.2.38), benzoylformate decarboxylases (BFDs, EC 4.1.1.7), acetohydroxyacid synthases (AHASs, EC 2.2.1.6), glyoxylate carboligases (GXCs, EC 4.1.1.47), sulfoacetaldehyde acetyltransferases (SAATs, EC 2.3.3.15), 2-hydroxyphytanoyl-CoA lyases (2-HPCLs) and 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexadiene-1-carboxylate synthase (SEPHCHC, MenD). Due to structural relations between the middle domain of these enzymes and the transhydrogenase domain dIII, the middle domain is called the TH3 domain [2,3].
