Abstract

Prediction of protein catalytic residues provides useful information for the study of protein function. Most existing methods combine structure and sequence information but rely heavily on sequence conservation derived from multiple sequence alignments, and the contribution of structure information is usually smaller than that of sequence conservation. We identified a novel structural feature, residue side chain orientation, which is the first structure-based feature to achieve prediction results comparable to those of evolutionary sequence conservation. We developed a structure-based method, Enzyme Catalytic residue SIde-chain Arrangement (EXIA), which is based on residue side chain orientations and the backbone flexibility of the protein structure. Predictions using EXIA outperform those using existing structure-based features, and the prediction quality obtained by combining EXIA with sequence conservation exceeds that of state-of-the-art prediction methods. EXIA is designed to predict catalytic residues from a single protein structure without requiring sequence or structure alignments. It therefore provides invaluable information when homology information for the target protein is insufficient or unreliable. We found that catalytic residues have distinctive side chain orientations and designed the EXIA method around this newly discovered feature. EXIA also performs well on a dataset of enzymes whose crystallographic structures contain no bound ligand.

Highlights

  • Owing to advances in structural genomics projects, the number of determined protein structures is increasing rapidly

  • We show prediction results for enzymes with a single catalytic residue and for a dataset of enzyme structures that intrinsically lack any bound ligand

  • To evaluate the performance of the EXIA method, we compared the predictions of EXIA without sequence conservation against those of the most recent and successful structure-based prediction method, the Partial Order Optimum Likelihood (POOL) method [14], which achieved the best performance among methods using structure information only



Introduction

Owing to advances in structural genomics projects, the number of determined protein structures is increasing rapidly. An information-theoretic approach for estimating sequence conservation based on Jensen–Shannon divergence has been used to predict catalytic residues from protein sequence [1]. Phylogenetic motifs, which are sequence regions that conserve the overall familial phylogeny, were shown to be a promising feature for protein functional site prediction [2]. Sequence conservation and a 3D profile, including cleft shape, stability, and electrostatic potential, generated from known enzyme structures were used to identify catalytic sites [3]. Another method detects specific conservation patterns near known catalytic residues in the sequence and constrains which combinations of amino acids can occur near a predicted catalytic residue [4]. In some cases, proteins with the same function have quite different tertiary structures [9].
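To make the conservation measure cited in [1] more concrete, the following is a minimal sketch of a per-column Jensen–Shannon divergence score for a multiple sequence alignment. It is not the implementation from [1]: the uniform background distribution, the small pseudocount, the omission of sequence weighting and gap penalties, and all function names (`column_distribution`, `js_divergence`, `conservation_scores`) are assumptions made here for illustration only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1e-6):
    """Amino acid frequencies of one alignment column (gaps are ignored).

    The pseudocount is an illustrative assumption to avoid zero probabilities.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return [(counts[aa] + pseudocount) / total for aa in AMINO_ACIDS]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def conservation_scores(alignment, background=None):
    """Score every column of an alignment given as equal-length strings.

    A uniform background is assumed here; a real application would
    normally use empirically estimated background frequencies.
    """
    if background is None:
        background = [1.0 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)
    return [
        js_divergence(column_distribution(column), background)
        for column in zip(*alignment)
    ]


# Toy alignment: the first two columns are invariant, the third varies.
toy_alignment = ["ACD", "ACE", "ACF"]
toy_scores = conservation_scores(toy_alignment)
```

In this sketch a higher score means the column's amino acid distribution diverges more from the background, so invariant columns score higher than variable ones.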
