Abstract

Succinylation is a type of protein post-translational modification (PTM) that plays important roles in a variety of cellular processes. As high-throughput mass spectrometry (MS) has yielded an increasing number of site-specific succinylated peptides, various tools have been developed to computationally identify succinylated sites on proteins. However, most of these tools predict succinylation sites using traditional machine learning methods. This work therefore aimed to predict succinylation sites with a deep learning model. The abundance of MS-verified succinylated peptides enabled the investigation of the substrate site specificity of succinylation sites through sequence-based attributes, such as position-specific amino acid composition, the composition of k-spaced amino acid pairs (CKSAAP), and the position-specific scoring matrix (PSSM). Additionally, maximal dependence decomposition (MDD) was adopted to detect the substrate signatures of lysine succinylation sites by dividing all succinylated sequences into several groups with conserved substrate motifs. According to the results of ten-fold cross-validation, the deep learning model trained using PSSM and informative CKSAAP attributes achieved the best predictive performance and also outperformed traditional machine learning methods. Moreover, an independent testing dataset that did not overlap with the training dataset was used to compare the proposed method with six existing prediction tools. The testing dataset comprised 218 positive and 2621 negative instances, and the proposed model yielded a promising performance with 84.40% sensitivity, 86.99% specificity, 86.79% accuracy, and an MCC value of 0.489. Finally, the proposed method has been implemented as a web-based prediction tool (CNN-SuccSite), which is freely accessible at http://csb.cse.yzu.edu.tw/CNN-SuccSite/.
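The reported independent-test figures can be checked against the standard confusion-matrix definitions of sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC). The sketch below is illustrative only: the abstract does not report the four counts, so they are back-calculated here from the stated 218 positive and 2621 negative instances and the stated percentages, and the function name `confusion_metrics` is hypothetical.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and MCC for a binary classifier."""
    sensitivity = tp / (tp + fn)            # fraction of true sites recovered
    specificity = tn / (tn + fp)            # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero; 0.0 is used as a fallback.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# ASSUMPTION: these counts are not given in the abstract. They are inferred from
# 218 positives at 84.40% sensitivity and 2621 negatives at 86.99% specificity.
metrics = confusion_metrics(tp=184, fn=34, tn=2280, fp=341)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these inferred counts, the computed values agree with the reported percentages to two decimal places and with the reported MCC to within rounding. The low MCC relative to the accuracy reflects the roughly 1:12 ratio of positive to negative instances.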

Highlights

  • Succinylation is a type of protein post-translational modification (PTM), which can play important roles in a variety of cellular processes

  • The amino acid composition (AAC) of 31-mer sequence fragments was a feasible scheme for exploring potential motifs of conserved residues around succinylation sites

  • Due to the abundance of experimentally verified succinylation data obtained from public resources, we were motivated to develop a new method to predict protein succinylation sites based on a deep learning strategy
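As a rough illustration of the 31-mer fragments and the amino acid composition (AAC) mentioned in the highlights, the sketch below extracts a window of 15 residues on each side of every lysine and computes the fraction of each of the 20 standard amino acids in that window. It is not the authors' implementation: the function names, the `X` padding symbol for windows that run past the sequence ends, and the example sequence are all assumptions made for this example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # ASSUMPTION: padding symbol for positions beyond the sequence ends


def lysine_windows(sequence, flank=15):
    """Return (1-based position, window) for every lysine; window length is 2*flank+1."""
    padded = PAD * flank + sequence + PAD * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # In the padded string the lysine sits at index i + flank,
            # so the slice starting at i places it at the window centre.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def amino_acid_composition(window):
    """Return the fraction of each standard amino acid, ignoring padding symbols."""
    counts = Counter(window)
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


# Hypothetical toy sequence, used only to exercise the functions.
example_sequence = "MKTAYIAKQRQISFVKSHFSRQ"
windows = lysine_windows(example_sequence)

for position, window in windows:
    aac = amino_acid_composition(window)
    print(position, window, round(aac["K"], 3))
```

Each window is centred on the candidate lysine, so the composition describes the residues surrounding that site. Whether padding positions are ignored or counted as a separate symbol is a design choice that the source does not specify.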


Summary

Introduction

Succinylation is a type of protein post-translational modification (PTM) that plays important roles in a variety of cellular processes. As high-throughput mass spectrometry (MS) has yielded an increasing number of site-specific succinylated peptides, various tools have been developed to computationally identify succinylated sites on proteins. Most of these tools predict succinylation sites using traditional machine learning methods. High-throughput MS has been widely adopted to identify large-scale datasets of site-specific succinylated peptides[5,6,7,8]. Owing to the quantitative succinylome data obtained from MS-based proteomics techniques, a variety of bioinformatics tools have been developed for predicting lysine succinylation sites from protein sequences. A list of previously proposed approaches concerning computational annotation of succinylated sites is given in
