DP-BINDER: machine learning model for prediction of DNA-binding proteins by fusing evolutionary and physicochemical information.

Farman Ali, Zar Nawab Khan Swati, Shahid Akbar, Saeed Ahmed

doi:10.1007/s10822-019-00207-x

Abstract

DNA-binding proteins (DBPs) participate in various biological processes, including DNA replication, recombination, and repair. In the human genome, about 6-7% of genes encode such proteins. DBPs shape DNA into a compact structure known as chromatin, while some of these proteins regulate chromosome packaging and transcription. In the pharmaceutical industry, DBPs are used as key components of antibiotics, steroids, and cancer drugs. These proteins are also involved in biophysical, biological, and biochemical studies of DNA. Because of their crucial role in these biological activities, identifying DBPs is an active topic in protein science. A series of experimental and computational methods have been proposed; however, some did not achieve the desired results, while others are inadequate in accuracy and reliability. More intelligent computational predictors are therefore still needed. In this work, we introduce a computational method, DP-BINDER, based on physicochemical and evolutionary information. We captured local, highly discriminative features from the physicochemical properties of primary protein sequences via normalized Moreau-Broto autocorrelation (NMBAC), and evolutionary information via position-specific scoring matrix-transition probability composition (PSSM-TPC) and pseudo-position-specific scoring matrix (PsePSSM), using training and independent datasets. The optimal features were selected from the fused features by support vector machine-recursive feature elimination with correlation bias reduction (SVM-RFE + CBR) and were fed into random forest (RF) and support vector machine (SVM) classifiers. Our method attained 92.46% and 89.58% accuracy with jackknife and ten-fold cross-validation, respectively, on the training dataset, and 81.17% accuracy on the independent dataset for prediction of DBPs. These results indicate that our method attained the highest success rate reported in the literature.
The superiority of DP-BINDER over existing approaches is due to several factors, including the extraction of locally dominant features via effective feature descriptors, the use of appropriate feature selection algorithms, and an effective classifier.
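
To make the NMBAC descriptor named in the abstract more concrete, the following is a minimal sketch of a normalized Moreau-Broto autocorrelation for a single physicochemical property. It uses the commonly cited form of the descriptor: the property scale is standardized over the 20 amino acids, and the feature for each lag is the average product of the standardized values of residues that lie that many positions apart. The abstract does not state which properties or maximum lag DP-BINDER uses, so the Kyte-Doolittle hydropathy scale, the value `max_lag=5`, the example sequence, and the function names `standardize` and `nmbac` are all assumptions chosen for illustration, not the authors' implementation.

```python
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy index. Used only as an example property;
# the abstract does not say which physicochemical properties were used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property to zero mean and unit standard deviation
    over the 20 standard amino acids."""
    mu = mean(scale.values())
    sigma = pstdev(scale.values())
    return {aa: (value - mu) / sigma for aa, value in scale.items()}


def nmbac(sequence, scale, max_lag):
    """Return one autocorrelation feature per lag in 1..max_lag."""
    z = standardize(scale)
    # Assumption: non-standard residues map to 0.0 (the standardized mean)
    # so that positions, and therefore lags, are preserved.
    values = [z.get(aa, 0.0) for aa in sequence]
    n = len(values)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    features = []
    for lag in range(1, max_lag + 1):
        total = sum(values[i] * values[i + lag] for i in range(n - lag))
        features.append(total / (n - lag))
    return features


# Arbitrary example sequence (hypothetical, not taken from the paper's datasets).
features = nmbac("MKRLSTEELLKKAGVSRSTLYRHFKS", KD, max_lag=5)
print(len(features))  # one feature per lag
```

With several properties, the per-property vectors would simply be concatenated, giving a fixed-length vector regardless of sequence length. That fixed length is what allows such features to be fused with PSSM-derived descriptors before feature selection.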


Journal: Journal of Computer-Aided Molecular Design
Publication Date: May 23, 2019
Citations: 62

Similar Papers

Identification of DNA-binding proteins using support vector machines and evolutionary profiles.
Manish Kumar ... Michael M Gromiha
BMC Bioinformatics | VOL. 8
27 Nov 2007

Combining physicochemical and evolutionary information for protein contact prediction.
Michael Schneider ... Oliver Brock
PLoS ONE | VOL. 9
22 Oct 2014

Feature selection and analysis on correlated gas sensor data with recursive feature elimination
Ke Yan ... David Zhang
Sensors and Actuators B: Chemical | VOL. 212
16 Feb 2015

TargetDBP: Accurate DNA-Binding Protein Prediction Via Sequence-Based Multi-View Feature Learning.
Jun Hu ... Yi-Heng Zhu
IEEE/ACM Transactions on Computational Biology and Bioinformatics | VOL. 17
18 Jan 2019
