Abstract

We developed two convolutional neural network (CNN) models for predicting ubiquitination sites in Arabidopsis thaliana, demonstrated their competitive performance, analyzed amino acid physicochemical properties and the CNN structures, and predicted ubiquitination sites across the Arabidopsis proteome. As an important post-translational protein modification, ubiquitination plays critical roles in plant physiology, including growth and development, responses to biotic and abiotic stress, and metabolism. Many ubiquitination site prediction models have been developed for human, mouse, and yeast, but few exist for the plant Arabidopsis thaliana. To address this gap, we propose two CNN-based models for predicting ubiquitination sites in A. thaliana. The two models reach AUC (area under the ROC curve) values of 0.924 and 0.913, respectively, in five-fold cross-validation, and 0.921 and 0.914, respectively, on an independent test set, outperforming the other models compared. We analyze in depth the amino acid physicochemical properties of the sequence regions flanking the ubiquitination sites, and study the influence of the CNN structure on prediction performance. Potential ubiquitination sites across the whole Arabidopsis proteome are predicted using the two CNN models. The source code, the training and test datasets, and the predicted ubiquitination sites in the Arabidopsis proteome are available on GitHub (http://github.com/nongdaxiaofeng/CNNAthUbi).
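
As an illustration of how sequence-based site predictors of this kind are commonly fed, the sketch below extracts a fixed-length window around each lysine (K) and one-hot encodes it. This is not taken from the authors' repository: the function names `lysine_windows` and `one_hot`, the flank length of 15 residues, and the padding symbol `X` are all assumptions made for the example.

```python
# Hypothetical preprocessing sketch; names and window size are assumptions,
# not the authors' published pipeline.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # assumed padding symbol for positions beyond the sequence ends


def lysine_windows(sequence, flank=15):
    """Yield (1-based position, window) for every lysine in the sequence.

    Each window has length 2 * flank + 1 with the lysine at its centre.
    """
    padded = PAD * flank + sequence + PAD * flank
    for i, residue in enumerate(sequence):
        if residue == "K":
            yield i + 1, padded[i : i + 2 * flank + 1]


def one_hot(window):
    """Encode a window as a list of 20-dimensional 0/1 rows.

    Padding or non-standard residues become all-zero rows.
    """
    index = {aa: j for j, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for residue in window:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for position, window in lysine_windows("MKAAK", flank=2):
        print(position, window, len(one_hot(window)))
```

Each encoded window is a matrix of shape (2 × flank + 1) × 20, which is a typical input layout for a one-dimensional CNN; the window length actually used by the two models should be taken from the published code.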

