Abstract

Comparative analysis of the sequences of enzymes encoded in a variety of prokaryotic and eukaryotic genomes reveals convergence and divergence at several levels. Functional convergence can be inferred when structurally distinct and hence non-homologous enzymes show the ability to catalyze the same biochemical reaction. In contrast, as a result of functional diversification, many structurally similar enzyme molecules act on substantially distinct substrates and catalyze diverse biochemical reactions. Here, we present updates on the ATP-grasp, alkaline phosphatase, cupin, HD hydrolase, and N-terminal nucleophile (Ntn) hydrolase enzyme superfamilies and discuss the patterns of sequence and structural conservation and diversity within these superfamilies. Typically, enzymes within a superfamily possess common sequence motifs and key active site residues, as well as (predicted) reaction mechanisms. These observations suggest that the strained conformation (the entatic state) of the active site, which is responsible for the substrate binding and formation of the transition complex, tends to be conserved within enzyme superfamilies. The subsequent fate of the transition complex is not necessarily conserved and depends on the details of the structures of the enzyme and the substrate. This variability of reaction outcomes limits the ability of sequence analysis to predict the exact enzymatic activities of newly sequenced gene products. Nevertheless, sequence-based (super)family assignments and generic functional predictions, even if imprecise, provide valuable leads for experimental studies and remain the best approach to the functional annotation of uncharacterized proteins from new genomes.

Highlights

  • The availability of complete genome sequences of numerous bacteria, archaea, and eukaryotes has fundamentally transformed modern biology

  • We consider the two key processes in enzyme evolution, namely sequence divergence, which leads to functional diversification within the same protein superfamily, and functional convergence, which results in members of distinct superfamilies being recruited to catalyze the same metabolic reaction

  • The availability of complete genome sequences of diverse bacteria, archaea, and eukaryotes illuminated the unexpected diversity of protein sequences encoded in those genomes


Summary

Functional Diversification of Protein Superfamilies

Proteins were unified in families based on sequence similarity [8]. Protein families were combined into superfamilies based on similar catalytic activities, sequence motifs, and other conserved features [9, 10]. The current classifications of protein structural (super)families, implemented in the popular SCOP, CATH, and Dali databases, are generally compatible with each other despite the differences between the underlying methodologies [11, 12, 13]. These superfamilies often correspond to sequence-based domain families (or clans) in the Pfam database [14] and contain conserved sequence motifs that are represented in such databases as InterPro [15]. These superfamilies span a wide range of sequence and structure conservation and provide multiple examples of divergence and convergence in the evolution of enzymes.

Table note: An expanded version of this table that includes EC numbers, references, and hyperlinks to related databases is available in supplemental Table S1 as well as on the NCBI ftp site (ftp.ncbi.nih.gov/pub/galperin/EnzymeSuperfamilies.html). Abbreviations: aa, amino acids; GPI, glycosylphosphatidylinositol; fGly, formylglycine.

Common traits of superfamily members
Did Evolution Favor Conservation of Entatic State?
Practical Aspects of Superfamily Assignment
Conclusions