Abstract
The automatic identification of catalytic residues remains an important challenge in structural bioinformatics. Sequence-based methods are good alternatives when the query shares a high percentage of identity with a well-annotated enzyme. However, when the homology is not apparent, as is the case for many structures from structural genomics initiatives, structural information should be exploited. When predicting functional residues, a local structural comparison is preferred to a global one. CMASA is a recently proposed method for predicting catalytic residues based on a local structure comparison. The method achieves high accuracy and a high Matthews correlation coefficient. However, point substitutions or a lack of relevant data strongly affect its performance. In the present study, we propose a simple extension to the CMASA method to overcome this difficulty. Extensive computational experiments are presented, both on proof-of-concept instances and on a few real cases. The results show that the extension performs well when the catalytic site contains mutated residues or when some residues are missing. The proposed modification correctly predicted the catalytic residues of a mutant thymidylate synthase, 1EVF. It also successfully predicted the catalytic residues for 3HRC despite the lack of information for a relevant side chain atom in the PDB file.
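For readers unfamiliar with the Matthews correlation coefficient mentioned above, the following is a minimal sketch of how it is computed from confusion-matrix counts. The function name and the example counts are illustrative assumptions, not values or code from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are true/false positive/negative counts (hypothetical inputs).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative counts only: a perfect and a fully inverted classifier.
perfect = matthews_corrcoef(tp=5, tn=5, fp=0, fn=0)
inverted = matthews_corrcoef(tp=0, tn=0, fp=5, fn=5)
```

The coefficient ranges from -1 to 1 and stays informative when classes are imbalanced, which is typical here because catalytic residues are a small fraction of any protein.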
Highlights
The automatic annotation of protein functions is a challenging problem in structural bioinformatics.
Enzymatic functions are utilized in almost all biological processes; these functions are usually related to a few catalytic residues.
Two test sets (A and B) are proposed. Each of these test sets contains a positive group of proteins and a negative group
Summary
The automatic annotation of protein functions is a challenging problem in structural bioinformatics. CMASA implements an algorithm that compares a database of catalytic residues and their 3D structure with the query protein. The 124 excluded proteins show one of the following characteristics: differences in the number of catalytic residues among members of the same family and their master template.
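The general idea of comparing a catalytic-site template with a query structure can be sketched as follows. This is a toy illustration under stated assumptions, not the CMASA algorithm or its extension: each residue is reduced to a single representative coordinate, residue types must match exactly, and a candidate is accepted when all pairwise distances agree with the template within a tolerance. The function name `match_template`, the tolerance value, and the coordinates are all hypothetical.

```python
import math
from itertools import product


def match_template(template, query, tol=1.0):
    """Find query residue sets that resemble a catalytic-site template.

    template, query: lists of (residue_name, (x, y, z)) tuples, one
    representative point per residue (an assumption of this sketch).
    Returns tuples of query indices, ordered like the template, whose
    pairwise distances differ from the template's by at most `tol`.
    """
    # For each template residue, collect query residues of the same type.
    candidates = [
        [i for i, (name, _) in enumerate(query) if name == t_name]
        for t_name, _ in template
    ]
    n = len(template)
    hits = []
    for combo in product(*candidates):
        if len(set(combo)) != n:
            continue  # a query residue cannot be used twice
        ok = all(
            abs(
                math.dist(template[i][1], template[j][1])
                - math.dist(query[combo[i]][1], query[combo[j]][1])
            ) <= tol
            for i in range(n)
            for j in range(i + 1, n)
        )
        if ok:
            hits.append(combo)
    return hits


# Hypothetical coordinates for illustration only.
template = [("CYS", (0, 0, 0)), ("HIS", (3, 0, 0)), ("ASP", (0, 4, 0))]
query = [
    ("GLY", (10, 10, 10)),
    ("CYS", (1, 1, 1)),
    ("HIS", (4, 1, 1)),
    ("ASP", (1, 5, 1)),
    ("HIS", (20, 1, 1)),
]
hits = match_template(template, query)
```

Because this sketch requires exact residue-type matches and a coordinate for every residue, a point substitution or a missing atom would make it fail, which is the kind of sensitivity the abstract describes. Distance comparison alone also cannot distinguish a site from its mirror image; methods based on superposition avoid that limitation.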