Abstract

The regulation of protein function by modulating the surface charge status via sequence-locally enriched phosphorylation sites (P-sites) in so-called phosphorylation “hotspots” has gained increased attention in recent years. We set out to identify P-hotspots in the model plant Arabidopsis thaliana. We analyzed the spacing of experimentally detected P-sites within peptide-covered regions along Arabidopsis protein sequences as available from the PhosPhAt database. Confirming earlier reports (Schweiger and Linial, 2010), we found that P-sites indeed tend to cluster and that the distributions of distances from serine and threonine P-sites to their respective closest neighboring P-site differ significantly from those for tyrosine P-sites. The ability to predict P-hotspots with available computational P-site prediction programs, which focus on identifying single P-sites, was found to be severely compromised by the inevitable interference of nearby P-sites. We devised a new approach, named HotSPotter, for the prediction of phosphorylation hotspots. HotSPotter is based primarily on local amino acid compositional preferences rather than sequence position-specific motifs and uses support vector machines as the underlying classification engine. HotSPotter correctly identified experimentally determined phosphorylation hotspots in A. thaliana with high accuracy. Applied to the Arabidopsis proteome, HotSPotter predicted 13,677 candidate P-hotspots in 9,599 proteins corresponding to 7,847 unique genes. Hotspot-containing proteins are involved predominantly in signaling processes, confirming the surmised modulating role of hotspots in signaling and interaction events. Our study provides new bioinformatics means to identify phosphorylation hotspots and lays the basis for further investigation of novel candidate P-hotspots. All phosphorylation hotspot annotations and predictions have been made available as part of the PhosPhAt database at http://phosphat.mpimp-golm.mpg.de.
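To make the idea of "local amino acid compositional preferences" concrete, the following is a minimal Python sketch of how a composition feature vector for a sequence window might be computed. It is an illustration only, not the HotSPotter implementation: the function name `window_composition`, the window half-width, and the use of plain relative frequencies over the 20 standard amino acids are all assumptions, and the actual feature set and SVM settings are not given in this text.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (feature order is arbitrary).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def window_composition(sequence, center, half_width=10):
    """Relative amino acid frequencies in a window around `center`.

    Hypothetical helper: `half_width` is an assumed parameter, not a value
    taken from the paper. The window is clipped at the sequence ends.
    """
    start = max(0, center - half_width)
    end = min(len(sequence), center + half_width + 1)
    window = sequence[start:end]
    if not window:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(window)
    # Position inside the window is deliberately ignored: only composition counts.
    return [counts.get(aa, 0) / len(window) for aa in AMINO_ACIDS]


# Toy sequence, invented for illustration only.
features = window_composition("MSSPTSSRSLAEEL", center=4, half_width=3)
print(dict(zip(AMINO_ACIDS, features)))
```

A vector of this kind could then serve as input to a support vector machine classifier (for example `sklearn.svm.SVC` in scikit-learn), in contrast to motif-based predictors that depend on which residue occupies which position.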

Highlights

  • Protein phosphorylation is one of the most significant and best characterized posttranslational modifications involved in a wide range of molecular regulatory and signaling mechanisms (Johnson, 2009)

  • A recent survey of amino acid-changing polymorphisms across many Arabidopsis thaliana accessions concluded that serine, threonine, and tyrosine sites associated with phosphorylation events were statistically more conserved than their non-phosphorylated counterparts

  • For real phosphorylation sites (P-sites), about 10% of all P-neighbor distances are at dN = 1 and close to 50% lie within dN < 6; for randomized sequences, the corresponding figures are only about 1% at dN = 1 and about 6% at dN < 6
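The distance statistic in the last highlight can be sketched in a few lines of Python. This is a hedged illustration, not the authors' procedure: it assumes that dN is the sequence distance from each P-site to its nearest neighboring P-site in the same protein, and that the random control redistributes the same number of sites among candidate positions. The function names and the toy positions are hypothetical.

```python
import random


def nearest_neighbor_distances(positions):
    """Distance from each site to its closest other site (one protein)."""
    sites = sorted(set(positions))
    if len(sites) < 2:
        return []  # a single site has no neighbor
    distances = []
    for i, site in enumerate(sites):
        candidates = []
        if i > 0:
            candidates.append(site - sites[i - 1])
        if i < len(sites) - 1:
            candidates.append(sites[i + 1] - site)
        distances.append(min(candidates))
    return distances


def fraction_where(distances, predicate):
    """Share of distances that satisfy `predicate` (0.0 for empty input)."""
    if not distances:
        return 0.0
    return sum(1 for d in distances if predicate(d)) / len(distances)


def randomized_sites(candidate_positions, n_sites, rng):
    """Assumed control: draw n_sites at random from the candidate positions."""
    return rng.sample(candidate_positions, n_sites)


# Toy data, invented for illustration only.
observed = [10, 11, 15, 40]
candidate_positions = [3, 10, 11, 15, 22, 27, 31, 40, 48, 55]
shuffled = randomized_sites(candidate_positions, len(observed), random.Random(0))

for label, sites in (("observed", observed), ("randomized", shuffled)):
    d = nearest_neighbor_distances(sites)
    print(label,
          fraction_where(d, lambda x: x == 1),
          fraction_where(d, lambda x: x < 6))
```

Aggregating these fractions over all proteins, and over many random draws for the control, would give figures comparable in kind to the percentages quoted above.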

Summary

Introduction

Protein phosphorylation is one of the most significant and best characterized posttranslational modifications involved in a wide range of molecular regulatory and signaling mechanisms (Johnson, 2009). The exact location of phosphorylation sites (P-sites) along the protein sequence, and in the three-dimensional structure, was thought to be a key determinant of their function, for example, by inducing conformational changes of the associated protein that allow for allosteric regulation (Barford et al., 1991). In agreement with this view, functionally relevant P-sites have been observed to be conserved in evolution such that their position in the sequence and structure of proteins is maintained across different bacteria (Macek et al., 2007), plants (Nakagami et al., 2010), vertebrates (Malik et al., 2008), and eukaryotes in general (Boekhorst et al., 2008). Given that several thousands of positions were included in the test, this weak significance is rather surprising.
