Abstract

To improve the prediction of DNA-binding proteins (DBPs), this paper presents a deep learning-based method named DBP-CNN. To extract informative features efficiently, we design a novel feature descriptor, position-specific scoring matrix-tetra slices-discrete cosine transform (PSSM-TS-DCT). PSSM-TS-DCT captures local features by dividing the PSSM into four slices (tetra slices) and retains the most decisive information by compressing each slice with the DCT. Two conventional feature descriptors, DDE (dipeptide deviation from expected mean) and BiPSSM (bigram position-specific scoring matrix), are also used for feature extraction. The feature vectors produced by these descriptors are provided to four classifiers: RF (random forest), ERT (extremely randomized trees), XGB (eXtreme gradient boosting), and 2D CNN (two-dimensional convolutional neural network). The proposed descriptor (PSSM-TS-DCT) outperforms DDE and BiPSSM on all four classifiers. Among the classifiers, 2D CNN with PSSM-TS-DCT attains accuracies 2.80% and 0.92% higher than the recent predictor on the training and independent datasets, respectively. The experimental results show that the proposed method (DBP-CNN) predicts DBPs more accurately than existing predictors in the literature.
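
The slicing-and-compression idea behind PSSM-TS-DCT can be illustrated with a minimal, dependency-free sketch. The abstract does not specify the details, so several choices below are assumptions made only for illustration: the PSSM is split row-wise (along the sequence) into four near-equal slices, an orthonormal 2D DCT-II is applied to each slice, and the top-left `keep` x `keep` block of low-frequency coefficients is retained and zero-padded when a slice is short. The function names `dct_1d`, `dct_2d`, and `pssm_ts_dct` are hypothetical and are not taken from the paper.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(matrix):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in matrix]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major


def pssm_ts_dct(pssm, n_slices=4, keep=5):
    """Illustrative descriptor (assumed details, not the authors' exact method).

    pssm: list of L rows, each with 20 scores (one row per residue).
    Returns n_slices * keep * keep low-frequency DCT coefficients.
    """
    length = len(pssm)
    if length < n_slices:
        raise ValueError("sequence is shorter than the number of slices")
    # Assumed slicing scheme: near-equal segments along the sequence axis.
    bounds = [i * length // n_slices for i in range(n_slices + 1)]
    features = []
    for start, end in zip(bounds, bounds[1:]):
        coeffs = dct_2d(pssm[start:end])
        for r in range(keep):
            for c in range(keep):
                inside = r < len(coeffs) and c < len(coeffs[0])
                features.append(coeffs[r][c] if inside else 0.0)  # zero-pad
    return features


# Toy 10 x 20 matrix standing in for a real PSSM (values are arbitrary).
toy_pssm = [[float((r * 7 + c * 3) % 11) for c in range(20)] for r in range(10)]
features = pssm_ts_dct(toy_pssm)
print(len(features))  # 4 slices * 5 * 5 coefficients
```

Keeping only the low-frequency block yields a fixed-length vector regardless of protein length, which is what allows proteins of different lengths to share one classifier input size. In practice a library routine such as `scipy.fft.dctn` with `norm="ortho"` would replace the hand-written transform.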
