Abstract

Background

Carbonylation, an irreversible oxidative modification of proteins, takes place through oxidation of specific residues by reactive oxygen species (ROS). Carbonylation has been reported to be associated with a number of metabolic or age-related diseases, including diabetes, chronic lung disease, Parkinson’s disease, and Alzheimer’s disease. Because computational methods dedicated to exploring motif signatures of protein carbonylation sites are lacking, we were motivated to use an iterative statistical method to characterize and identify carbonylated sites with motif signatures.

Results

By manually curating experimental data from research articles, we obtained 332, 144, 135, and 140 verified substrate sites for K (lysine), R (arginine), T (threonine), and P (proline) residues, respectively, from 241 carbonylated proteins. To identify attributes that are informative for distinguishing carbonylated from non-carbonylated sites, several features were investigated in this study: composition of the twenty amino acids (AAC), composition of amino acid pairs (AAPC), position-specific scoring matrix (PSSM), and positional weighted matrix (PWM). Additionally, to explore the motif signatures of carbonylation sites, an iterative statistical method was adopted to detect statistically significant dependencies of amino acid composition between specific positions around substrate sites. A profile hidden Markov model (HMM) was then trained as a predictive model for each motif signature. Finally, a support vector machine (SVM) was used to construct an integrative model by combining the bit scores obtained from the profile HMMs. The combined model provided enhanced performance, with balanced sensitivity and specificity, in both cross-validation and independent testing.

Conclusion

This study provides a new scheme for exploring potential motif signatures at substrate sites of protein carbonylation. The usefulness of the revealed motifs in identifying carbonylated sites is demonstrated by their effective performance in cross-validation and independent testing. Finally, these substrate motifs were used to build a publicly available online resource (MDD-Carb, http://csb.cse.yzu.edu.tw/MDDCarb/) and are anticipated to facilitate the study of large-scale carbonylated proteomes.
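The abstract mentions detecting statistically significant dependencies of amino acid composition between positions around substrate sites, but does not name the test. As a minimal sketch only, assuming a Pearson chi-square test of independence and equal-length sequence fragments (both assumptions, as are the function names and the toy fragments), the dependency between two positions could be scored as follows:

```python
from collections import Counter


def chi_square_between_positions(fragments, i, j):
    """Pearson chi-square statistic for independence of the residues
    observed at positions i and j across equal-length fragments.

    Assumption: the source does not specify the test; chi-square is
    used here purely for illustration.
    """
    n = len(fragments)
    pair_counts = Counter((f[i], f[j]) for f in fragments)
    row_counts = Counter(f[i] for f in fragments)
    col_counts = Counter(f[j] for f in fragments)

    chi2 = 0.0
    for a, row_total in row_counts.items():
        for b, col_total in col_counts.items():
            expected = row_total * col_total / n
            observed = pair_counts.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Degrees of freedom of the observed contingency table.
    dof = (len(row_counts) - 1) * (len(col_counts) - 1)
    return chi2, dof


def position_dependency_scores(fragments):
    """For each position, sum its chi-square statistics against all
    other positions; a higher score suggests stronger dependency."""
    length = len(fragments[0])
    return [
        sum(
            chi_square_between_positions(fragments, i, j)[0]
            for j in range(length)
            if j != i
        )
        for i in range(length)
    ]


# Hypothetical 3-residue fragments centred on K (toy data, not from the study).
toy_fragments = ["AKC", "AKC", "GKD", "GKD"]
scores = position_dependency_scores(toy_fragments)
```

In an iterative scheme, the position with the highest score would typically be used to split the fragments into subgroups before repeating the procedure on each subgroup; the stopping criteria and significance thresholds used by the authors are not given in this section.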

Highlights

  • Carbonylation, an irreversible oxidative modification of proteins, takes place through oxidation of specific residues by reactive oxygen species (ROS)

  • Investigation of amino acid composition at carbonylated sites: to study the composition of amino acids around carbonylated sites, a graphical representation was prepared by counting the occurrences of each amino acid surrounding the carbonylation sites and dividing by the length of the fragment, excluding the carbonylation site itself

  • In the prediction of K carbonylation sites, the support vector machine (SVM) models trained with amino acid composition (AAC) and with positional weighted matrix (PWM) yield the best performance, with an accuracy of 0.69, a Matthews correlation coefficient (MCC) of 0.37, and an area under the ROC curve (AUC) of 0.78
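The composition calculation described in the highlights above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name, the toy fragments, and the choice to skip non-standard characters (such as padding symbols) are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def flanking_composition(fragments, center):
    """Fraction of each of the twenty amino acids among the residues
    flanking the central (carbonylated) site.

    The residue at index `center` is excluded from both the counts and
    the denominator. Assumption: characters outside the twenty standard
    amino acids (e.g. padding) are skipped as well.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for fragment in fragments:
        for pos, residue in enumerate(fragment):
            if pos == center or residue not in counts:
                continue
            counts[residue] += 1
            total += 1
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


# Hypothetical 5-residue fragments centred on K (toy data, not from the study).
toy_fragments = ["AAKGG", "ACKGA"]
composition = flanking_composition(toy_fragments, center=2)
```

The resulting twenty fractions sum to 1 and can be compared between carbonylated and non-carbonylated fragments, or used directly as an AAC feature vector.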


Summary

Introduction

Carbonylation, an irreversible oxidative modification of proteins, takes place through oxidation of specific residues by reactive oxygen species (ROS). Several types of post-translational modifications (PTMs) have been reported to occur in a non-catalyzed manner and are often influenced by the amino acid composition, structural environment, and physicochemical properties of proteins. These PTMs are known as non-enzymatic protein modifications and include oxidation, S-nitrosylation, glutathionylation, carbonylation, isomerization, sulfenylation, deamidation, and glycation [4, 5]. Oxidative stress arises from an abundance of ROS. Protein carbonylation is an irreversible PTM that has been regarded as a biomarker of oxidative stress because of its relative stability and ease of quantification [7, 8].
