Abstract

Although protein phosphorylation sites can be reliably identified with high-resolution mass spectrometry, the experimental approach is time-consuming and resource-intensive. Furthermore, it is unlikely that an experimental approach could catalog an entire phosphoproteome. Computational prediction of phosphorylation sites offers an efficient and flexible way to reveal potential phosphorylation sites and to generate hypotheses for experimental design. Musite is a tool that we previously developed to predict phosphorylation sites based solely on protein sequence. However, it had not been comprehensively applied to plants. In this study, phosphorylation data from Arabidopsis thaliana, B. napus, G. max, M. truncatula, O. sativa, and Z. mays were collected for cross-species testing as well as for overall plant-specific prediction. The results show that the model for A. thaliana can be extended to other organisms, and that the overall plant model from Musite outperforms the current plant-specific prediction tools, PlantPhos and PhosPhAt, in prediction accuracy. Furthermore, a comparative study of predicted phosphorylation sites across orthologs among different plants was conducted to reveal potential evolutionary features. A bipolar distribution was observed: isolated, non-conserved phosphorylation sites at one extreme and sites highly conserved in terms of amino acid type at the other. The study also shows that predicted phosphorylation sites conserved within orthologs do not necessarily share more sequence similarity in the flanking regions than the background, but they often inherit protein disorder, a property that does not necessitate high sequence conservation. Our analysis also suggests that the phosphorylation frequencies of serine, threonine, and tyrosine correlate with their relative proportions in disordered regions. Musite can be used as a web server (http://musite.net) or downloaded as an open-source standalone tool (http://musite.sourceforge.net/).

Highlights

  • Protein phosphorylation plays important roles in numerous cellular processes in plants

  • We recently developed Musite (Gao et al, 2010), which incorporates feature selection and machine-learning processes as well as other useful tools into one open-source framework

  • Datasets: The phosphorylation sites analyzed came from six organisms: A. thaliana (Nuhse et al, 2004, 2007; Wolschin and Weckwerth, 2005; de la Fuente van Bentem et al, 2006, 2008; Benschop et al, 2007; Sugiyama et al, 2008; Whiteman et al, 2008; Hsu et al, 2009; Ito et al, 2009; Jones et al, 2009; Li et al, 2009; Reiland et al, 2009; Wang et al, 2009; Chen et al, 2010; Kline et al, 2010; Nakagami et al, 2010; Engelsberger and Schulze, 2012; Meyer et al, 2012), B. napus (Meyer et al, 2012), G. max (Meyer et al, 2012), M. truncatula (Grimsrud et al, 2010), O. sativa (Nakagami et al, 2010), and Z. mays (Bi et al, 2011)

Introduction

Protein phosphorylation plays important roles in numerous cellular processes in plants. Although mass spectrometry-based studies have provided high-throughput phosphorylation data, it is still time-consuming and expensive to identify phosphorylation sites experimentally. Computational prediction of phosphorylation sites directly from protein sequences provides an alternative approach. We recently developed Musite (Gao et al, 2010), which incorporates feature selection and machine-learning processes as well as other useful tools into one open-source framework. It is computationally efficient, offers a statistical assessment of data quality, and can handle proteome-wide prediction.
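
To make the idea of predicting directly from sequence concrete, the following minimal sketch enumerates every serine, threonine, and tyrosine in a protein sequence together with its flanking residues, which is the kind of input a sequence-based predictor could score. It is an illustration only and not Musite's implementation: the function name `candidate_windows`, the flank length of 6, and the `-` padding character are all assumptions made for this example.

```python
from typing import List, Tuple

# Residues that can be phosphorylated (serine, threonine, tyrosine).
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def candidate_windows(
    sequence: str, flank: int = 6, pad: str = "-"
) -> List[Tuple[int, str, str]]:
    """Return (1-based position, residue, window) for every S/T/Y.

    Hypothetical helper for illustration; the flank length and padding
    character are assumed values, not parameters taken from Musite.
    """
    seq = sequence.upper()
    # Pad both ends so sites near the termini still get full-length windows.
    padded = pad * flank + seq + pad * flank
    sites = []
    for i, residue in enumerate(seq):
        if residue in PHOSPHO_RESIDUES:
            # Index i in seq corresponds to index i + flank in padded,
            # so the window centered on it starts at padded[i].
            window = padded[i : i + 2 * flank + 1]
            sites.append((i + 1, residue, window))
    return sites


if __name__ == "__main__":
    for position, residue, window in candidate_windows("MASTY", flank=2):
        print(position, residue, window)
```

Each window could then be turned into features and passed to a trained classifier; that scoring step is left out here because the text above does not describe it.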

