TDFFM: Transformer and Deep Forest Fusion Model for Predicting Coronavirus 3C-Like Protease Cleavage Sites.

Qingsong Wang, Ruiquan Ge, Changmiao Wang, Ahmed Elazab, Qiming Fang, Renfeng Zhang

doi:10.1109/tcbb.2024.3378470

Abstract

COVID-19, caused by the highly contagious SARS-CoV-2 virus, is distinguished by its positive-sense, single-stranded RNA genome. A thorough understanding of SARS-CoV-2 pathogenesis is crucial for halting its proliferation. Notably, the 3C-like protease of the coronavirus (denoted as 3CLpro) is instrumental in the viral replication process. Precise delineation of 3CLpro cleavage sites is imperative for elucidating the transmission dynamics of SARS-CoV-2. Machine learning tools have been deployed to identify potential 3CLpro cleavage sites, but these existing methods often fall short in accuracy. To improve the performance of these predictions, we propose a novel analytical framework, the Transformer and Deep Forest Fusion Model (TDFFM). Within TDFFM, we use the AAindex and the BLOSUM62 matrix to encode protein sequences. The encoded features are then fed into two distinct components: a Deep Forest, which is an effective decision tree ensemble method, and a Transformer equipped with a Multi-Level Attention Model (TMLAM). The attention mechanism allows our model to identify positive samples more accurately, thus enhancing the overall predictive performance. Evaluation on a test set shows that TDFFM achieves an accuracy of 0.955, an AUC of 0.980, and an F1-score of 0.367, substantiating the model's superior prediction capabilities.
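
The abstract reports a high accuracy alongside a much lower F1-score, a pattern that is common when positive samples (cleavage sites) are rare. The Python sketch below illustrates how two component models' scores might be combined and then evaluated with both metrics. It is not the authors' implementation: the abstract does not state how the Deep Forest and TMLAM outputs are fused, so the weighted average in `fuse_probabilities`, the 0.5 decision threshold, all function names, and the toy labels and scores are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def fuse_probabilities(p_a: Sequence[float], p_b: Sequence[float],
                       weight_a: float = 0.5) -> List[float]:
    """Weighted average of two models' positive-class probabilities.

    ASSUMPTION: a simple late-fusion rule chosen for illustration only;
    the abstract does not describe the actual fusion step.
    """
    if len(p_a) != len(p_b):
        raise ValueError("both models must score the same samples")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(p_a, p_b)]


def confusion_counts(y_true: Sequence[int],
                     y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_and_f1(y_true: Sequence[int],
                    y_pred: Sequence[int]) -> Tuple[float, float]:
    """Accuracy = (tp + tn) / total; F1 = 2*tp / (2*tp + fp + fn)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy data: 2 positives (cleavage sites) among 10 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
p_forest = [0.9, 0.2, 0.7, 0.1, 0.2, 0.1, 0.3, 0.1, 0.2, 0.1]
p_transformer = [0.8, 0.4, 0.5, 0.2, 0.1, 0.3, 0.1, 0.2, 0.1, 0.1]

fused = fuse_probabilities(p_forest, p_transformer)
y_pred = [1 if p >= 0.5 else 0 for p in fused]  # assumed 0.5 threshold
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, F1={f1:.3f}")
```

On this toy data the accuracy is 0.800 while the F1-score is 0.500: the many correctly rejected negatives lift accuracy, whereas F1 depends only on true positives, false positives, and false negatives. The same effect, on a more imbalanced test set, would be consistent with the gap between the reported 0.955 accuracy and 0.367 F1-score.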

Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Similar Papers

Proteolytic Cleavage of Bovine Adenovirus 3-Encoded pVIII.
Amit Gaba ... Niraj Makadiya
Journal of Virology | VOL. 91
28 Apr 2017

The Arterivirus Nsp4 Protease Is the Prototype of a Novel Group of Chymotrypsin-like Enzymes, the 3C-like Serine Proteases
Eric J Snijder ... Alexander E Gorbalenya
Journal of Biological Chemistry | VOL. 271
01 Mar 1996

Site-directed Mutagenesis, Proteolytic Cleavage, and Activation of Human Proheparanase
Ghada Abboud-Jarrous ... Israel Vlodavsky
Journal of Biological Chemistry | VOL. 280
01 Apr 2005

Prediction of coronavirus 3C-like protease cleavage sites using machine-learning algorithms.
Huiting Chen ... Yousong Peng
Virologica Sinica | VOL. 37
02 May 2022
