Abstract

DNA-binding proteins (DBPs) are closely associated with several types of cancer (lung, breast, and liver) and other fatal diseases (AIDS/HIV, asthma), and they are used in drug design. A series of predictors has been constructed for identifying DBPs; however, a more accurate computational predictor is still needed. In this work, a deep learning-based predictor (DBP-DeepCNN) is proposed to improve DBP prediction. Salient features are derived by a novel method, R-PSSM-DWT (reduced position-specific scoring matrix-discrete wavelet transform), as well as by Lead-BiPSSM (lead-bigram position-specific scoring matrix), PSSM-DPC (position-specific scoring matrix-dipeptide composition), ED-PSSM (evolutionary difference position-specific scoring matrix), and F-PSSM (filtered position-specific scoring matrix). The models are then trained with a 2D CNN (two-dimensional convolutional neural network), XGB (eXtreme gradient boosting), AdaBoost, and ERT (extremely randomized trees). The 2D CNN-based model with R-PSSM-DWT achieved accuracies 6.92% and 1.32% higher than the existing approach on the training and independent datasets, respectively. These results indicate that the proposed predictor outperforms existing methods. In addition to being a promising method for large-scale prediction of DBPs, DBP-DeepCNN could help establish more effective therapeutic strategies for chronic disease treatment.
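
To make one of the named descriptors concrete, the following is a minimal sketch of PSSM-DPC under a formulation commonly used in the literature: for an L x 20 PSSM P, feature (i, j) is the average over consecutive residue pairs of P[k][i] * P[k+1][j]. The abstract does not spell out the authors' exact definition or any normalization of the PSSM, so the function name `pssm_dpc`, the plain list-of-lists input, and the absence of score scaling are assumptions made for illustration only.

```python
from typing import List


def pssm_dpc(pssm: List[List[float]]) -> List[float]:
    """Flattened dipeptide-composition features from a PSSM.

    Assumed formulation (not confirmed by the abstract):
        D[i][j] = (1 / (L - 1)) * sum_k P[k][i] * P[k + 1][j]
    where L is the number of rows (residues) and the columns are
    the amino-acid types (20 for a standard PSSM).
    """
    n_rows = len(pssm)
    if n_rows < 2:
        raise ValueError("PSSM needs at least two rows (residues)")
    n_cols = len(pssm[0])

    # Accumulate products of scores at adjacent sequence positions.
    features = [0.0] * (n_cols * n_cols)
    for k in range(n_rows - 1):
        cur, nxt = pssm[k], pssm[k + 1]
        for i in range(n_cols):
            for j in range(n_cols):
                features[i * n_cols + j] += cur[i] * nxt[j]

    # Average over the L - 1 adjacent pairs so that the feature
    # length and scale do not depend on protein length.
    return [v / (n_rows - 1) for v in features]


if __name__ == "__main__":
    # Toy 3-residue, 2-column matrix (a real PSSM has 20 columns).
    toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(pssm_dpc(toy))
```

For a standard 20-column PSSM this yields a fixed-length vector of 400 values regardless of sequence length, which is the property that lets such descriptors feed classifiers like the ones named above.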
