iDNAProt-ES: Identification of DNA-binding Proteins Using Evolutionary and Structural Features

Shahana Yasmin Chowdhury, Swakkhar Shatabda, Abdollah Dehzangi

doi:10.1038/s41598-017-14945-1

Open Access

https://doi.org/10.1038/s41598-017-14945-1

Abstract

DNA-binding proteins play a very important role in the structural composition of DNA. In addition, they regulate and affect various cellular processes such as transcription, DNA replication, DNA recombination, repair and modification. The experimental methods used to identify DNA-binding proteins are expensive and time-consuming, which has attracted researchers from the computational field to address the problem. In this paper, we present iDNAProt-ES, a DNA-binding protein prediction method that uses both sequence-based evolutionary features and structure-based features of proteins to identify their DNA-binding functionality. We used recursive feature elimination to extract an optimal set of features and trained on them using a Support Vector Machine (SVM) with a linear kernel to select the final model. Our proposed method significantly outperforms the existing state-of-the-art predictors on a standard benchmark dataset. The accuracy of the predictor is 90.18% using the jackknife test and 88.87% using 10-fold cross-validation on the benchmark dataset. The accuracy of the predictor on the independent dataset is 80.64%, which is also significantly better than the state-of-the-art methods. iDNAProt-ES is a novel prediction method that uses evolutionary and structure-based features. We believe the superior performance of iDNAProt-ES will motivate researchers to use this method to identify DNA-binding proteins. iDNAProt-ES is publicly available as a web server at: http://brl.uiu.ac.bd/iDNAProt-ES/.
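The two evaluation protocols named in the abstract, the jackknife test and 10-fold cross-validation, differ only in how the benchmark set is split. The following is a minimal sketch of both splits, not the authors' code. The helper names, the toy one-dimensional data, and the nearest-centroid classifier are all assumptions made for illustration; in particular, the nearest-centroid classifier is only a stand-in for the linear-kernel SVM used in the paper.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label); hypothetical toy format


def k_fold_splits(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) pairs for k contiguous folds."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        splits.append((train, test))
        start += size
    return splits


def jackknife_splits(n: int) -> List[Tuple[List[int], List[int]]]:
    """Jackknife (leave-one-out): every sample is its own test fold."""
    return k_fold_splits(n, n)


def nearest_centroid_fit(train: Sequence[Sample]) -> Callable[[List[float]], int]:
    """Stand-in classifier (NOT the paper's SVM): one mean vector per class."""
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x: List[float]) -> int:
        # Pick the class whose centroid is closest in squared Euclidean distance.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def accuracy(data: Sequence[Sample], splits) -> float:
    """Fraction of samples predicted correctly when each is held out once."""
    correct = 0
    for train_idx, test_idx in splits:
        predict = nearest_centroid_fit([data[i] for i in train_idx])
        correct += sum(predict(data[i][0]) == data[i][1] for i in test_idx)
    return correct / len(data)


# Hypothetical, well-separated toy data with alternating class labels.
toy_data: List[Sample] = [
    ([0.1 * i], 0) if i % 2 == 0 else ([5.0 + 0.1 * i], 1) for i in range(20)
]
ten_fold_accuracy = accuracy(toy_data, k_fold_splits(len(toy_data), 10))
jackknife_accuracy = accuracy(toy_data, jackknife_splits(len(toy_data)))
```

The folds here are contiguous and unshuffled to keep the sketch short; a real evaluation would normally shuffle the samples, and often stratify them by class, before splitting.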

Highlights

Computational methods that have been used to predict DNA-binding proteins can be broadly categorized into two groups: structure-based methods[11,12] and sequence-based methods[13,14,15,16,17,18,19].
We compare the results achieved by iDNAProt-ES with previous state-of-the-art methods found in the literature, including DNABinder[28], DNA-Prot[25], iDNA-Prot[26], iDNA-Prot|dis[13], DBPPred[15], iDNAPro-PseAAC[14], PseDNA-Pro[29], Kmer1 + ACC[30] and Local-DPP[16].
We present iDNAProt-ES, a novel prediction method for identification of DNA-binding proteins.

Summary

Introduction

Computational methods that have been used to predict DNA-binding proteins can be broadly categorized into two groups: structure-based methods[11,12] and sequence-based methods[13,14,15,16,17,18,19]. DNA-Prot is another software tool, proposed in[25]. They used amino acid composition, physicochemical properties and secondary structure information as features and trained their model using a Random Forest classifier. Amino acid distance-pair coupling information and the amino acid reduced alphabet profile were incorporated into the general form of pseudo amino acid composition[31] by Liu et al.[13] They offered a freely available web server called iDNA-Prot|dis. They used a wrapper-based best-first feature selection technique to select an optimal set of features. They used features based on amino acid composition, PSSM scores, secondary structures and relative solvent accessibility and trained their model using Random Forest and Gaussian Naive Bayes classifiers. They used a profile-based protein representation and selected a set of 23 optimal features using Linear Discriminant Analysis (LDA). Their model was trained using a Support Vector Machine (SVM) classifier. Other recent works include SVM-PSSM-DT[32], PNImodeler[33], CNNsite[34], BindUP[35], etc.
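Amino acid composition, one of the sequence-based features mentioned above, is the relative frequency of each of the 20 standard amino acids in a protein sequence. The following is a minimal sketch of how such a feature vector can be computed. The function name and the example sequence are hypothetical, and this is not taken from any of the cited tools.

```python
from collections import Counter
from typing import Dict

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> Dict[str, float]:
    """Return the fraction of each standard amino acid in the sequence.

    Characters outside the 20 standard codes (for example X or gaps)
    are ignored, which is an assumption of this sketch.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Counter returns 0 for residues that never occur in the sequence.
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical four-residue example: M, K, K, A.
composition = amino_acid_composition("MKKA")
```

For the toy sequence `MKKA`, the entry for `K` is 0.5, the entries for `M` and `A` are 0.25 each, and all other entries are 0, so the vector always has a fixed length of 20 regardless of protein length.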

Journal: Scientific Reports	Publication Date: Nov 2, 2017
Citations: 82	License type: open-access
