Combining Biochemical Features and Evolutionary Information for Predicting DNA-Binding Residues in Protein Sequences

Liangjiang Wang

doi:10.1007/978-3-642-10238-7_15

Abstract

This paper describes a new machine learning approach for prediction of DNA-binding residues from protein sequence data. Several biologically relevant features, including biochemical properties of amino acid residues and evolutionary information of protein sequences, were selected for input encoding. The evolutionary information was represented as position-specific scoring matrices (PSSMs) and several new descriptors developed in this study. The sequence-derived features were then used to train random forests (RFs), which could handle a large number of input variables and avoid model overfitting. The use of evolutionary information together with biochemical features was found to significantly improve classifier performance. The RF classifier was further evaluated using a separate test dataset. The results suggest that the RF-based approach gives rise to more accurate prediction of DNA-binding residues than previous studies.

Keywords: DNA-binding site prediction; feature extraction; evolutionary information; random forests; machine learning
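The input encoding described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the sliding window, the function name `encode_residues`, and the use of a simplified side-chain charge as the only biochemical feature are assumptions made for illustration, since the abstract does not list the exact properties or the window scheme. The toy PSSM values are made up; a real PSSM has 20 columns per residue.

```python
from typing import Dict, List, Sequence

# Simplified side-chain charge at neutral pH. Assumption: this stands in for
# the biochemical features, which the abstract does not enumerate.
SIDE_CHAIN_CHARGE: Dict[str, float] = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def encode_residues(
    sequence: str,
    pssm: Sequence[Sequence[float]],
    half_window: int = 1,
) -> List[List[float]]:
    """Build one feature vector per residue from a sliding window.

    Each window position contributes [charge] + its PSSM row. Positions that
    fall off either end of the sequence are zero-padded.
    """
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    row_width = len(pssm[0]) if pssm else 0
    padding = [0.0] * (1 + row_width)

    vectors: List[List[float]] = []
    for centre in range(len(sequence)):
        features: List[float] = []
        for pos in range(centre - half_window, centre + half_window + 1):
            if 0 <= pos < len(sequence):
                charge = SIDE_CHAIN_CHARGE.get(sequence[pos], 0.0)
                features.extend([charge, *pssm[pos]])
            else:
                features.extend(padding)
        vectors.append(features)
    return vectors


# Toy input: 3 residues with hypothetical 2-column PSSM rows.
toy_sequence = "KAD"
toy_pssm = [[2.0, -1.0], [0.0, 3.0], [-2.0, 1.0]]
toy_vectors = encode_residues(toy_sequence, toy_pssm, half_window=1)
```

Vectors of this shape, paired with binding/non-binding labels, could then be passed to any random forest implementation (for example scikit-learn's `RandomForestClassifier`); that choice of library is likewise an assumption and is not stated in the abstract.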

Similar Papers

Random Forests for Prediction of DNA-Binding Residues in Protein Sequences Using Evolutionary Information
Liangjiang Wang
01 Dec 2008

Prediction of DNA-binding residues from protein sequence information using random forests
Liangjiang Wang ... Jack Y Yang
BMC Genomics | Vol. 10
01 Jan 2009

Sequence-Based Prediction of DNA-Binding Residues in Proteins with Conservation and Correlation Information
Xin Ma ... Xiao Sun
IEEE/ACM Transactions on Computational Biology and Bioinformatics | Vol. 9
01 Nov 2012

PiRaNhA: a server for the computational prediction of RNA-binding residues in protein sequences
Y Murakami ... S Jones
Nucleic Acids Research | Vol. 38
27 May 2010
