Abstract

Protein methylation, one of the most common protein post-translational modifications, plays vital roles in signal transduction and many other cellular processes. Identifying methylation sites helps in understanding the molecular mechanisms of methylation-related biological processes. In silico prediction has emerged as a powerful approach for identifying methylation sites, and it also facilitates downstream characterization and site-specific investigation. Herein, we propose a novel strategy for predicting methylation sites that combines the pseudo amino acid composition (PseAAC) with the protein chain description as global features of the protein sequence. These global features draw on amino acid composition and sequence-order information, together with the physicochemical properties and structural characteristics of the amino acids. A support vector machine (SVM) is used to build the prediction model for methylation sites from these global features, and a global stochastic optimization technique, particle swarm optimization (PSO), is employed to search for the optimal SVM parameters. On the independent prediction set, the accuracy, sensitivity, specificity and Matthews correlation coefficient are 98.11%, 96.23%, 100% and 96.30%, respectively. These results indicate that the method is effective at identifying protein arginine methylation sites. For comparison, other predictors were also constructed using different feature extraction and modeling strategies. The results show that the proposed method can greatly improve the prediction of arginine methylation sites.
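The four evaluation measures quoted above follow from the standard confusion-matrix definitions, and a minimal Python sketch may make them concrete. The function name `binary_metrics` and the counts passed to it are assumptions made for illustration: the abstract does not state the size of the independent set, and the counts below are simply one hypothetical combination that reproduces the reported percentages.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity, MCC) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # fraction of true methylation sites recovered
    specificity = tn / (tn + fp)   # fraction of non-sites correctly rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the matrix is empty; use 0.0 then.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, sensitivity, specificity, mcc


# Hypothetical counts, NOT taken from the source: 53 positive and 53 negative
# samples with 2 missed positives and no false positives.
acc, sn, sp, mcc = binary_metrics(tp=51, fn=2, tn=53, fp=0)
for name, value in [("accuracy", acc), ("sensitivity", sn),
                    ("specificity", sp), ("MCC", mcc)]:
    print(f"{name}: {value * 100:.2f}%")
```

The Matthews correlation coefficient is normally reported on a scale from -1 to 1; the sketch multiplies it by 100 only to match the percentage form used in the abstract.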
