Abstract

Recognizing phosphorylation sites in proteins is required for reconstructing regulatory processes in living systems. The task is complicated because phosphorylation motifs in amino acid sequences are highly degenerate. To improve prediction performance, researchers often use additional descriptors intended to reflect the physicochemical features of the regions surrounding a site. We evaluated the merit of this approach by using MNA (multilevel neighbourhoods of atoms) molecular descriptors to represent peptide segments structurally. Comparative testing was performed with the prediction method PASS on two types of input data: sets of MNA descriptors, which represent peptides as chemical structures, and amino acid sequences written in one-letter code. Training sets were classified according to the established types of the enzymes (protein kinases) that modify the corresponding phosphorylation sites. The accuracy estimates obtained by validating the predictions differed significantly between substrate classes for both the letter and the molecular descriptors. With the letter description, prediction accuracy depended less on the length of the peptides in the training set, whereas with structural descriptors the accuracy was determined by the peptide size and by the descriptor characteristics (MNA levels). For different kinase families, maximal prediction accuracy was achieved at different sizes of the molecular fragments covered by MNA descriptors of the corresponding levels. This apparently reflects structural differences in the surroundings of phosphorylation sites modified by different protein kinases. Molecular descriptors gave prediction results comparable to those obtained with the traditional letter representation. Where accuracy was high, it depended less on the method used to describe the site-surrounding peptides. Where the letter description cannot provide acceptable accuracy, MNA descriptors can achieve better accuracy.
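The letter-code input described above, site-surrounding peptides of a chosen length, can be illustrated with a minimal sketch. It is not the authors' PASS workflow: the function name `site_windows`, the `half_width` parameter, the candidate residues S, T and Y, and the `-` padding character are all assumptions made for illustration only.

```python
def site_windows(sequence, half_width, residues="STY", pad="-"):
    """Return (1-based position, peptide) pairs for each candidate residue.

    Hypothetical helper: each peptide spans half_width residues on either
    side of the candidate site, padded with `pad` at the sequence ends.
    """
    # Pad both ends so that windows near the termini keep a fixed length.
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, amino_acid in enumerate(sequence):
        if amino_acid in residues:
            # Index i in `sequence` is index i + half_width in `padded`,
            # so the window starts at i and covers 2 * half_width + 1 letters.
            windows.append((i + 1, padded[i : i + 2 * half_width + 1]))
    return windows


# Toy sequence, not taken from any real protein.
for position, peptide in site_windows("MKRSPTAY", half_width=2):
    print(position, peptide)
```

Varying `half_width` changes the peptide length, which is the parameter whose influence on accuracy the abstract compares between the two representations.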
