Abstract

Background: As a result of the growing body of protein phosphorylation site data, the number of phosphoprotein databases is constantly increasing, and dozens of tools are available for predicting protein phosphorylation sites to achieve fast automatic results. However, none of the existing tools has been developed to predict protein phosphorylation sites in rice.

Results: In this paper, the phosphorylation site predictors NetPhos 2.0, NetPhosK, Kinasephos, Scansite, Disphos and Predphosphos were integrated to construct meta-predictors of rice-specific phosphorylation sites using several methods, including unweighted voting, unreduced weighted voting, reduced unweighted voting and weighted voting strategies. PhosphoRice, the meta-predictor produced by using the weighted voting strategy with parameters selected by restricted grid search and conditional random search, performed the best at predicting phosphorylation sites in rice. Its Matthews correlation coefficient (MCC) and accuracy (ACC) reached 0.474 and 73.8%, respectively. Compared to the best individual element predictor (Disphos_default), PhosphoRice achieved a significant increase in MCC of 0.071 (P < 0.01) and an increase in ACC of 4.6%.

Conclusions: PhosphoRice is a powerful tool for predicting unidentified phosphorylation sites in rice. We found that our tool showed greater robustness in ACC and MCC than the existing methods. PhosphoRice is available to the public at http://bioinformatics.fafu.edu.cn/PhosphoRice.
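To make the terms above concrete, the following is a minimal sketch of a weighted vote over binary element-predictor outputs, together with the standard MCC and ACC formulas computed from a confusion matrix. It is not the PhosphoRice implementation: the function names, the 0.5 decision threshold and the weights in the usage example are illustrative assumptions, and the abstract does not state how the real weights or threshold were chosen beyond the search strategies it names.

```python
from math import sqrt


def weighted_vote(predictions, weights, threshold=0.5):
    """Combine binary calls (1 = phosphorylated, 0 = not) from several predictors.

    predictions: dict mapping predictor name -> 0 or 1 for one candidate site
    weights:     dict mapping predictor name -> non-negative weight (assumed values)
    threshold:   assumed cut-off on the normalised weighted score
    """
    total = sum(weights[name] for name in predictions)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    score = sum(weights[name] * predictions[name] for name in predictions) / total
    return score >= threshold


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for two equal-length sequences of 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined here as 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of sites classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Hypothetical weights for three of the element predictors named in the abstract.
    weights = {"NetPhos": 3, "Disphos": 2, "Scansite": 1}
    calls = {"NetPhos": 1, "Disphos": 1, "Scansite": 0}
    print(weighted_vote(calls, weights))

    counts = confusion_counts([1, 1, 0, 0], [1, 0, 0, 1])
    print(mcc(*counts), accuracy(*counts))
```

Setting every weight to the same value reduces this sketch to a plain unweighted vote, which is one way to read the difference between the voting strategies listed in the Results.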

Highlights

  • As a result of the growing body of protein phosphorylation site data, the number of phosphoprotein databases is constantly increasing, and dozens of tools are available for predicting protein phosphorylation sites to achieve fast automatic results

  • Phosida contains large-scale data from Homo sapiens and Bacillus subtilis [15]; PhosphoSite is a curated site that focuses on vertebrate systems [16]; and PhosPhAt is a phosphorylation site database that is specific for Arabidopsis [17]

  • Preprocessing performance assessment of element predictors: all of the protein sequences in the dataset were run through all 15 element predictors


Summary

Introduction

As a result of the growing body of protein phosphorylation site data, the number of phosphoprotein databases is constantly increasing, and dozens of tools are available for predicting protein phosphorylation sites to achieve fast automatic results. The growing data on protein phosphorylation sites have stimulated the development of computational approaches to predict these sites from protein sequences. However, the existing protein phosphorylation site prediction tools show a data sampling bias. For example, PhosPhAt, which predicts phosphorylated serine sites in Arabidopsis, is benchmarked to perform better with Arabidopsis sequences than generic predictors do [17]. No existing method has been developed to predict protein phosphorylation sites in rice.
