Abstract

Glycation is a non-enzymatic process, occurring inside or outside the host body, in which a sugar molecule attaches to a protein or lipid molecule. It is an important form of post-translational modification (PTM) that impairs the function and changes the characteristics of proteins, so identifying glycation sites may provide useful guidance for understanding various biological functions of proteins. In this study, we proposed an accurate prediction tool, named Glypre, for lysine glycation. Firstly, we used multiple informative features to encode the peptides: the position scoring function, secondary structure, AAindex, and the composition of k-spaced amino acid pairs. Secondly, we statistically analysed the distribution of distinctive features of the residues surrounding the glycation and non-glycation sites. Thirdly, based on the distribution of these features, we developed a new predictor that uses a different optimal window size for each property and a two-step feature selection method, in which the maximum relevance minimum redundancy method is followed by a greedy feature selection procedure. In 10-fold cross-validation, Glypre achieved a sensitivity of 57.47%, a specificity of 90.78%, an accuracy of 79.68%, an area under the receiver-operating characteristic (ROC) curve (AUC) of 0.86, and a Matthews correlation coefficient (MCC) of 0.52. The detailed analysis showed that our predictor may play a complementary role to existing methods for identifying protein lysine glycation. The source code and datasets of Glypre are available in the Supplementary File.
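
For readers less familiar with the performance measures quoted in the abstract, the following minimal sketch shows how sensitivity, specificity, accuracy, and MCC are conventionally derived from confusion-matrix counts. The function name `binary_metrics` and the counts in the usage line are hypothetical illustrations; they are not taken from the Glypre source code or from the study's data.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: glycation sites predicted correctly/incorrectly.
    tn/fp: non-glycation sites predicted correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)  # fraction of true sites recovered
    specificity = tn / (tn + fp)  # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; return 0.0 by convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not the study's results).
metrics = binary_metrics(tp=50, fn=50, tn=90, fp=10)
```

Because MCC uses all four counts, it remains informative when non-glycation sites greatly outnumber glycation sites, which is one common reason it is reported alongside accuracy.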

Highlights

  • Glycation is a post-translational modification produced by a reaction between reducing sugars and the amino groups of lysine or arginine, or N-terminal amino acids

  • We explored the application of several features to the lysine glycation prediction problem and used a novel two-step feature selection, the maximum relevance minimum redundancy method followed by a greedy feature selection (GFS) procedure, to remove redundancy and contradiction among features and thereby improve the prediction performance and generalizability of the model

  • First, the distribution of different properties, including position conservation, secondary structure, the Amino Acid Index database (AAindex), and the composition of k-spaced amino acid pairs, was investigated on the basis of a window size of 31
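
The composition of k-spaced amino acid pairs named in the highlights can be illustrated with a short sketch. It assumes the common definition of this encoding: for each gap k, count every ordered residue pair separated by k positions and normalise by the number of such pairs in the window. The function name `cksaap`, the maximum gap `k_max=4`, and the example peptide are assumptions made for illustration; the text above does not state which gaps Glypre uses.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def cksaap(peptide, k_max=4):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    Returns 400 * (k_max + 1) frequencies, ordered by gap and then by pair.
    k_max=4 is an assumed default, not a value taken from the paper.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        n_pairs = len(peptide) - k - 1  # number of positions with a partner
        if n_pairs <= 0:
            raise ValueError("peptide is too short for gap k=%d" % k)
        counts = dict.fromkeys(pairs, 0)
        for i in range(n_pairs):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # ignore padding or non-standard residues
                counts[pair] += 1
        features.extend(counts[p] / n_pairs for p in pairs)
    return features


# Hypothetical 31-residue window with the candidate lysine at the centre.
example = "A" * 15 + "K" + "A" * 15
vector = cksaap(example)
```

With a window of 31 residues and gaps 0 to 4, this yields a 2000-dimensional vector, which illustrates why a feature selection step is useful before training a classifier.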

Introduction

Glycation is a post-translational modification produced by a reaction between reducing sugars and the amino groups of lysine or arginine, or N-terminal amino acids. The accumulation of glycation products is known to be associated with the pathogenesis of aging and the complications of diabetes. Glycation plays crucial regulatory roles in almost all cellular processes and is involved in other human diseases, such as Alzheimer’s [1] and Parkinson’s diseases [2]. Researchers working on metabolism have paid increasing attention to lysine glycation [5].
