Abstract
Background: Protein S-sulfenylation is a type of post-translational modification (PTM) involving the covalent binding of a hydroxyl group to the thiol of a cysteine residue. Recent evidence has shown the importance of S-sulfenylation in various biological processes, including transcriptional regulation, apoptosis and cytokine signaling. Determining the specific sites of S-sulfenylation is fundamental to understanding the structures and functions of S-sulfenylated proteins. However, the current lack of reliable computational tools often limits researchers to expensive and time-consuming laboratory techniques for identifying S-sulfenylation sites. Thus, we were motivated to develop a bioinformatics method for investigating S-sulfenylation sites based on amino acid compositions and physicochemical properties.
Results: In this work, physicochemical properties were used not only to identify S-sulfenylation sites from 1,096 experimentally verified S-sulfenylated proteins, but also to compare their predictive effectiveness with that of other features: amino acid composition (AAC), amino acid pair composition (AAPC), solvent-accessible surface area (ASA), amino acid substitution matrix (BLOSUM62), position-specific scoring matrix (PSSM), and positional weighted matrix (PWM). Various prediction models were built using a support vector machine (SVM) and evaluated by five-fold cross-validation. The model constructed from hybrid features, including PSSM and physicochemical properties, yielded the best performance, with sensitivity, specificity, accuracy and MCC of 0.746, 0.737, 0.738 and 0.337, respectively. The selected model also achieved a promising accuracy (0.693) on an independent testing dataset. Additionally, we employed TwoSampleLogo to help discover the differences in amino acid composition among S-sulfenylation, S-glutathionylation and S-nitrosylation sites.
Conclusion: This work proposed a computational method to explore informative features and functions for protein S-sulfenylation. Evaluation by five-fold cross-validation indicated that the selected features were effective in identifying S-sulfenylation sites. Moreover, the independent testing results demonstrated that the proposed method could provide a feasible means of conducting preliminary analyses of protein S-sulfenylation. We also anticipate that the uncovered differences in amino acid composition may facilitate future studies of the extensive crosstalk among S-sulfenylation, S-glutathionylation and S-nitrosylation.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-2299-1) contains supplementary material, which is available to authorized users.
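The four measures reported above (sensitivity, specificity, accuracy and MCC) can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)   # fraction of true sites recovered
    specificity = tn / (tn + fp)   # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical, imbalanced counts chosen for illustration only (NOT the paper's data).
example = binary_metrics(tp=75, fp=260, tn=740, fn=25)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes such as these hypothetical counts, sensitivity and specificity can both be near 0.75 while the MCC stays much lower. This is one reason MCC is commonly reported alongside accuracy.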
Highlights
Post-translational modification (PTM) at the cysteine residues is essential to the dynamic functions of proteins
For non-S-sulfenylated sites, there was an abundance of neutral amino acids, including leucine (L), cysteine (C), histidine (H), methionine (M), phenylalanine (F) and tyrosine (Y), at positions ranging from −9 to +7, while arginine (R) residues seemed to be concentrated at three positions (−1, +1 and +2) around non-S-sulfenylation sites
The analytical results indicated that the positions of amino acids relative to one another in the sequence play a vital role in discriminating between S-sulfenylation and non-S-sulfenylation sites
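Position-specific observations like those above are usually made on fixed-length sequence windows centered on each cysteine. The sketch below shows one way to extract such windows and tally residues at each relative position. The flank length of 10, the `-` padding character and the function names are assumptions for illustration; the source does not state the window size used.

```python
from collections import Counter


def cysteine_windows(sequence, flank=10, pad="-"):
    """Return (1-based position, window) for every cysteine in the sequence.

    Each window spans `flank` residues on either side of the cysteine;
    positions beyond the protein termini are filled with `pad`.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "C":
            # Index i in `sequence` sits at i + flank in `padded`,
            # so the window starts at i in `padded`.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def position_counts(windows, flank=10):
    """Count residues at each offset (-flank .. +flank) across windows."""
    offsets = range(-flank, flank + 1)
    counts = {offset: Counter() for offset in offsets}
    for window in windows:
        for offset, residue in zip(offsets, window):
            counts[offset][residue] += 1
    return counts


# Toy sequence for illustration only (not a real protein).
sites = cysteine_windows("MKCAAC", flank=2)
counts = position_counts([w for _, w in sites], flank=2)
print(sites)
print(counts[-1])
```

Comparing such per-position counts between modified and unmodified sites is the kind of contrast that tools such as TwoSampleLogo visualize, with appropriate statistical testing.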
Summary
Post-translational modification (PTM) at cysteine residues is essential to the dynamic functions of proteins. S-sulfenylation is a reversible covalent modification of the thiol group of cysteine residues by hydrogen peroxide, whereby Cys-SH is oxidized to sulfenic (Cys-SOH), sulfinic (Cys-SO2H) and sulfonic (Cys-SO3H) acids. These changes contribute strongly to the regulation of protein function under both normal and oxidative stress conditions [1,2,3,4]. Several chemoproteomic approaches have been developed for identifying specific sites in proteins that undergo S-sulfenylation [6,7,8,9,10]. These experimental methods are often expensive and time-consuming. We were therefore motivated to develop a bioinformatics method for investigating S-sulfenylation sites based on amino acid compositions and physicochemical properties.
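As an illustration of the simplest feature mentioned here, amino acid composition (AAC) can be encoded as the relative frequency of each of the 20 standard amino acids in a sequence window. This is a minimal sketch under that common definition. The function name, the alphabetical residue ordering and the handling of padding characters are assumptions, not details taken from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes


def amino_acid_composition(window):
    """Return a 20-dimensional vector of residue frequencies for a window.

    Characters outside the 20 standard residues (e.g. '-' padding) are ignored.
    """
    residues = [r for r in window if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(aa) / total for aa in AMINO_ACIDS]


# Toy window for illustration only.
features = amino_acid_composition("AAC-")
print(dict(zip(AMINO_ACIDS, features)))
```

A vector of this kind could be passed, alone or concatenated with other encodings, to a classifier such as an SVM.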