Abstract

The prediction of ion ligand–binding residues in protein sequences is a challenging task that contributes to understanding the specific functions of proteins in life processes. In this article, we selected the binding residues of 14 ion ligands as research objects, including four acid radical ion ligands and 10 metal ion ligands. Based on the amino acid sequence information, we selected the composition and position conservation information of amino acids, the predicted structural information, and the physicochemical properties of amino acids as basic feature parameters. We then performed a statistical analysis and reclassification of the dihedral angles and proposed new methods for extracting feature parameters. These methods mainly included applying information entropy to extract the polarization charge and hydrophilic–hydrophobic information of amino acids and using position weight matrices to extract position conservation information. Using the random forest algorithm as the prediction model, we obtained better prediction results than previous work. In the independent test, the Matthews correlation coefficient and accuracy for the 10 metal ion ligand–binding residues were larger than 0.07 and 52%, respectively; the corresponding values for the four acid radical ion ligand–binding residues were larger than 0.15 and 86%, respectively. Further, we classified and combined the phi and psi angles and optimized the prediction model for each ion ligand–binding residue.
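For readers unfamiliar with the two evaluation measures reported above, the following minimal sketch computes accuracy and the Matthews correlation coefficient from the counts of a binary confusion matrix (binding versus non-binding residues). It uses the standard textbook definitions; the function names and the example counts are hypothetical and are not taken from this article.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of residues classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to 1.

    Returns 0.0 when the denominator is zero (a common convention),
    for example when one class is never predicted.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not results from the article.
    tp, tn, fp, fn = 40, 45, 5, 10
    print("accuracy:", accuracy(tp, tn, fp, fn))
    print("MCC:", matthews_cc(tp, tn, fp, fn))
```

Because binding residues are usually far rarer than non-binding ones, accuracy alone can look high for a classifier that rarely predicts binding; the Matthews correlation coefficient accounts for all four counts, which is presumably why both measures are reported together.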

Highlights

  • Proteins are the foundation of life and play an important role in life activities

  • A composition feature and a 2L-dimensional position conservation feature extracted from the phi and psi angles were added to predict the binding residues of ion ligands

  • The results showed that the ion ligand–binding residues were sensitive to the information of the reclassified dihedral angle

Introduction

Proteins are the foundation of life and play an important role in life activities. Because more than half of all proteins require binding with ion ligands to function, research on ion ligand–binding residues in proteins is of great significance. It is difficult to accurately predict ion ligand–binding residues from the protein sequence because of the small size and high versatility of ion ligands. Current theoretical prediction methods for ligand-binding residues can be roughly classified into sequence-based methods and three-dimensional (3D) structure–based methods. Experiments have shown that the accuracy of 3D structure–based prediction is higher than that of sequence-based prediction (Yang et al, 2013, 2018). However, the number of proteins with known amino acid sequences is far greater than the number with known 3D structures. Therefore, although the prediction accuracy of sequence-based methods is not as satisfactory as that of 3D structure–based methods, sequence-based methods are still generally used.
