Abstract

Computational methods for protein post-translational modification (PTM) site prediction provide a useful approach for studying protein functions. The prediction accuracy of the existing methods has significant room for improvement. A recent deep-learning architecture, Capsule Network (CapsNet), which can characterize the internal hierarchical representation of input data, presents a great opportunity to solve this problem, especially using small training data. We proposed a CapsNet for predicting protein PTM sites, including phosphorylation, N-linked glycosylation, N6-acetyllysine, methyl-arginine, S-palmitoyl-cysteine, pyrrolidone-carboxylic-acid and SUMOylation sites. The CapsNet outperformed the baseline convolutional neural network architecture MusiteDeep and other well-known tools in most cases and provided promising results for practical use, especially in learning from small training data. The capsule length also gives an accurate estimate for the confidence of the PTM prediction. We further demonstrated that the internal capsule features could be trained as a motif detector of phosphorylation sites when no kinase-specific phosphorylation labels were provided. In addition, CapsNet generates robust representations that have strong discriminant power in distinguishing kinase substrates from different kinase families. Our study sheds some light on the recognition mechanism of PTMs and applications of CapsNet on other bioinformatic problems. The codes are free to download from https://github.com/duolinwang/CapsNet_PTM. Supplementary data are available at Bioinformatics online.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.