Abstract

Anticancer peptides (ACPs) have been widely applied in cancer treatment owing to their good safety, acceptable side effects, and high selectivity. However, the number of experimentally validated ACPs is limited because experimental identification is extremely expensive. Hence, accurate and cost-effective methods for identifying ACPs are urgently needed. In this work, we propose a deep learning-based model, named iACP-DFSRA, for ACP identification. Specifically, we adopted two kinds of sequence encoding, embeddings from the ProtBert_BFD pre-trained language model and handcrafted features, to encode protein sequences. LightGBM was then used for feature selection, and the selected features were fed into a ResCNN and an Attention mechanism to extract local and global features, respectively. Finally, the concatenated features were deeply fused using the Attention mechanism, so that the model pays more attention to key features, and predictions were made by a fully connected layer. The results of 10-fold cross-validation demonstrated that the iACP-DFSRA model improved on the latest AACFlow model in most metrics, with Sp of 94.15%, Sn of 95.32%, Acc of 94.74%, and MCC of 89.48%. Indeed, iACP-DFSRA is the only model with Acc > 90% and MCC > 80% on the independent test dataset. We further demonstrated the superiority of our model on additional datasets. In addition, t-SNE and SHAP interpretation analyses showed that using two channels for feature extraction and the Attention mechanism for deep fusion is crucial, which helps iACP-DFSRA predict ACPs more effectively.
