Abstract
As posttranslational modification sites have come under closer study, protein ubiquitination has become a key problem in understanding the molecular mechanisms of posttranslational modification. Pupylation is a widespread process in which a prokaryotic ubiquitin-like protein (Pup) is attached to a substrate through a series of biochemical reactions. However, experimental methods for identifying pupylation sites are often time-consuming and laborious. This study proposes an improved approach for predicting pupylation sites. First, the Pearson correlation coefficient was used to measure the correlation among amino acid pairs, calculated from the frequency of each amino acid. Second, the multiple types of features were each filtered by their Pearson correlation coefficient values, ranked in descending order. Third, to obtain a balanced dataset, the K-means principal component analysis (KPCA) oversampling technique was used to synthesize new positive samples, and a fuzzy undersampling method was used to reduce the number of negative samples. Finally, the method was evaluated with jackknife and 10-fold cross-validation tests. Averaged over the 10-fold cross-validation, the sensitivity (Sn) was 90.53%, the specificity (Sp) was 99.8%, the accuracy (Acc) was 95.09%, and the Matthews correlation coefficient (MCC) was 0.91. On an independent test dataset, the method achieved an Acc of 83.75% and an MCC of 0.49, which was superior to previous predictors. This performance and stability indicate that the proposed method is an effective way to predict pupylation sites.
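The four evaluation measures reported above (Sn, Sp, Acc, MCC) follow from the counts of a binary confusion matrix. The sketch below uses their standard definitions; the function name `classification_metrics` and the example counts are illustrative assumptions, not values or code from the study.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return Sn, Sp, Acc and MCC from confusion-matrix counts.

    tp/fn count pupylated (positive) sites, tn/fp count non-pupylated
    (negative) sites. Any ratio with an empty denominator is reported as 0.0.
    """
    sn = tp / (tp + fn) if (tp + fn) else 0.0            # sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0            # specificity
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0            # accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0  # Matthews correlation
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Hypothetical counts, chosen only to exercise the formulas.
example = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

MCC uses all four cells of the confusion matrix, so on an imbalanced test set it can be much lower than Acc, which is one reason to report both.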
Highlights
Pupylation is a prokaryotic analog of ubiquitination whose prokaryotic ubiquitin-like protein (Pup) separates intracellular proteins under the action of enzymes and modifies the target protein [1,2]
The Pearson correlation coefficient among amino acids was calculated from amino acid composition (AAC) values, and the resulting values were sorted in descending order
The Pearson correlation coefficient was used to evaluate the relevance between any two amino acids, which was then used to obtain an optimal combination of amino acid pairs
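The ranking step described in these highlights can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that AAC is the fraction of each of the 20 standard amino acids in a sequence fragment, and that the correlation between two amino acids is taken across the fragments in a dataset. The names `aac`, `pearson`, and `ranked_pairs` are hypothetical.

```python
from itertools import combinations
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(fragment):
    """Amino acid composition: fraction of each standard residue."""
    residues = [r for r in fragment.upper() if r in AMINO_ACIDS]
    total = len(residues) or 1  # avoid dividing by zero on empty input
    return [residues.count(a) / total for a in AMINO_ACIDS]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def ranked_pairs(fragments):
    """Rank all amino acid pairs by correlation of their AAC values."""
    rows = [aac(f) for f in fragments]   # one AAC vector per fragment
    columns = list(zip(*rows))           # one column per amino acid
    scores = [
        (AMINO_ACIDS[i] + AMINO_ACIDS[j], pearson(columns[i], columns[j]))
        for i, j in combinations(range(len(AMINO_ACIDS)), 2)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `ranked_pairs` on a list of fragments returns all 190 unordered pairs, highest correlation first; a cutoff on that list would then select the pairs to keep. The cutoff itself is not given in the text above, so none is assumed here.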
Summary
Pupylation is a prokaryotic analog of ubiquitination whose prokaryotic ubiquitin-like protein (Pup) separates intracellular proteins under the action of enzymes and modifies the target protein [1,2]. Pup is an identified posttranslational small modifier in prokaryotes [1,2] that usually attaches to the substrate lysine via isopeptide bonds, and this process is called pupylation. Since experimental methods are laborious, it is essential to improve current computational methodologies to guide further research. An example of this is the work on understanding the stability of ERCC1 DNA repair protein, a biomarker of several advanced