iACP-GE: accurate identification of anticancer peptides by using gradient boosting decision tree and extra tree

Y Liang,X Ma

doi:10.1080/1062936x.2022.2160011

Abstract

Cancer is one of the main diseases threatening human life, accounting for millions of deaths around the world each year. Traditional physical and chemical methods for cancer treatment are extremely time-consuming, lab-intensive, expensive, inefficient and difficult to apply in a high-throughput way. Hence, it is an urgent task to develop automated computational methods that enable fast and accurate identification of anticancer peptides (ACPs). In this paper, we develop a novel model named iACP-GE to identify ACPs. Multiple features are extracted by using binary encoding, enhanced grouped amino acid composition and BLOSUM62 encoding based on the N5C5 sequence, as well as detrended forward moving-average auto-cross correlation analysis based on physicochemical properties of the 20 natural amino acids. In this way, 835 features are obtained for each sample. To avoid information redundancy, a gradient boosting decision tree is adopted as the feature selection strategy. The optimal feature subset is then input to the extra tree classifier. The accuracies on the ACP740 and ACP240 datasets with 5-fold cross-validation were 90.54% and 91.25%, respectively. Experimental results indicate that iACP-GE significantly outperforms several existing models on the ACP740 and ACP240 datasets and can be used as an effective tool for the identification of ACPs. The datasets and source codes for iACP-GE are available at https://github.com/yunyunliang88/iACP-GE.
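
As an illustration of the first feature type named in the abstract, the following is a minimal sketch of binary (one-hot) encoding applied to an N5C5 fragment. It is not taken from the iACP-GE repository. It assumes that "N5C5" means the first five N-terminal residues joined with the last five C-terminal residues, and that residues are indexed alphabetically by one-letter code; the function names `n5c5` and `binary_encode` and the example peptide are hypothetical.

```python
# Assumption: the 20 natural amino acids, ordered alphabetically by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def n5c5(sequence, n=5, c=5):
    """Join the first n and last c residues of a peptide (assumed meaning of N5C5)."""
    seq = sequence.strip().upper()
    if len(seq) < max(n, c):
        raise ValueError("peptide is shorter than the requested terminal window")
    return seq[:n] + seq[-c:]


def binary_encode(fragment):
    """One-hot encode each residue as a 20-dimensional 0/1 vector and concatenate."""
    features = []
    for residue in fragment:
        one_hot = [0] * len(AMINO_ACIDS)
        index = AMINO_ACIDS.find(residue)
        if index >= 0:  # unknown symbols are left as an all-zero vector
            one_hot[index] = 1
        features.extend(one_hot)
    return features


if __name__ == "__main__":
    peptide = "FLPLLAGLAANFLPKIFCKITRK"  # illustrative sequence only
    fragment = n5c5(peptide)
    vector = binary_encode(fragment)
    print(fragment, len(vector))
```

Under these assumptions, each peptide contributes 10 positions × 20 amino acids = 200 binary values; how the paper's 835 features are divided among the four encodings is not stated in the abstract.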


Journal: SAR and QSAR in Environmental Research
Publication Date: Dec 25, 2022
Citations: 6

Similar Papers

ACP-GBDT: An improved anticancer peptide identification method with gradient boosting decision tree.
Yanjuan Li ... Di Ma
Frontiers in Genetics | VOL. 14
29 Mar 2023

iRNA-ac4C: A novel computational method for effectively detecting N4-acetylcytidine sites in human mRNA
Wei Su ... Yan-Wen Li
International Journal of Biological Macromolecules | VOL. 227
05 Dec 2022

i6mA-VC: A Multi-Classifier Voting Method for the Computational Identification of DNA N6-methyladenine Sites.
Tian Xue ... Shengli Zhang
Interdisciplinary sciences, computational life sciences | VOL. 13
08 Apr 2021

Exploring best-matched embedding model and classifier for charging-pile fault diagnosis
Wen Wang ... Jianhua Wang
Cybersecurity | VOL. 6
04 Apr 2023
