Abstract

Cysteine S-sulfenylation is an important protein post-translational modification that plays a crucial role in transcriptional regulation, cell signaling, and protein function. To better elucidate the molecular mechanism of S-sulfenylation, it is important to accurately identify S-sulfenylated substrates and their corresponding S-sulfenylation sites. In this study, a novel bioinformatics tool named Sulf_FSVM is proposed to predict S-sulfenylation sites using multiple feature extraction methods and a fuzzy support vector machine algorithm. On the one hand, amino acid factors, binary encoding, and the composition of k-spaced amino acid pairs are combined to encode S-sulfenylation sites, and the maximum relevance minimum redundancy method is adopted to remove redundant features. On the other hand, a fuzzy support vector machine algorithm is used to handle class imbalance and noise in the S-sulfenylation site training dataset. In 10-fold cross-validation, Sulf_FSVM achieves satisfactory performance, with a sensitivity of 73.26%, a specificity of 70.78%, an accuracy of 71.07%, and a Matthews correlation coefficient of 0.2971. Independent tests also show that Sulf_FSVM significantly outperforms existing S-sulfenylation site predictors. Therefore, Sulf_FSVM can be a useful tool for accurate prediction of protein S-sulfenylation sites.
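
For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how sensitivity, specificity, accuracy, and the Matthews correlation coefficient are conventionally computed from confusion-matrix counts. The function name `classification_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not taken from the Sulf_FSVM study or its datasets.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute Sn, Sp, Acc and MCC from confusion-matrix counts.

    tp/fn: positive (modified) sites predicted correctly/incorrectly.
    tn/fp: negative (unmodified) sites predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    # Sensitivity: fraction of true sites that are recovered.
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-sites that are correctly rejected.
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    # Accuracy: fraction of all predictions that are correct.
    acc = (tp + tn) / total if total else 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical, imbalanced example: 100 positive and 900 negative sites.
metrics = classification_metrics(tp=70, fp=270, tn=630, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this illustrative example the positive class is much smaller than the negative class, so the Matthews correlation coefficient is considerably lower than the sensitivity, specificity, and accuracy. This is one reason the coefficient is commonly reported alongside the other three metrics for imbalanced site-prediction datasets.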
