Abstract

Malonylation is a recently identified protein post-translational modification that regulates a variety of cellular physiological processes. However, identifying malonylation sites through traditional experiments is costly and time-consuming, so predicting these sites by computational methods plays an important role in experimental design. In this paper, a new prediction model for malonylation sites, Malsite-Deep, is proposed. First, seven feature extraction methods are used to extract feature information from protein sequences. Then, the NearMiss-2 under-sampling method is applied to handle the imbalanced data, and the update gate and reset gate of gated recurrent units (GRU) are used to select the optimal feature subset. Finally, the output of the GRU layer is fed into a deep neural network (DNN) to predict the malonylation sites, and model performance is evaluated by 10-fold cross-validation and on independent test sets. In 10-fold cross-validation, the AUC on the training dataset reaches 0.99, and the AUC values on the four independent test datasets all exceed 0.95. These results suggest that Malsite-Deep facilitates the identification of protein malonylation sites.
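
To illustrate the under-sampling step named above, the following is a minimal sketch of the NearMiss-2 selection rule: keep the majority-class (non-malonylation) samples whose average distance to their k farthest minority-class samples is smallest. It is not the authors' implementation; the function name `near_miss_2`, the parameters `n_keep` and `k`, and the toy 2-D vectors are assumptions chosen for illustration only.

```python
import math

def near_miss_2(majority, minority, n_keep, k=3):
    """Return the n_keep majority samples whose mean Euclidean distance
    to their k farthest minority samples is smallest (NearMiss-2 rule)."""
    k = min(k, len(minority))
    scored = []
    for idx, x in enumerate(majority):
        # Distances from this majority sample to every minority sample.
        dists = sorted(math.dist(x, m) for m in minority)
        # Average over the k largest distances.
        scored.append((sum(dists[-k:]) / k, idx))
    scored.sort()
    # Preserve the original ordering of the retained samples.
    keep = sorted(idx for _, idx in scored[:n_keep])
    return [majority[i] for i in keep]

# Toy feature vectors (hypothetical, not data from the paper).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority = [(0.5, 0.5), (5.0, 5.0), (0.2, 0.1),
            (9.0, 9.0), (1.0, 1.0), (4.0, 0.0)]

# Under-sample the majority class down to the minority class size.
balanced_majority = near_miss_2(majority, minority,
                                n_keep=len(minority), k=2)
```

In this toy case the three majority points lying closest to the minority cluster are retained, giving a balanced set of three positive and three negative samples. The class ratio and the value of k used for Malsite-Deep are not stated in the abstract.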
