Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.

Huiying Zhao, Yuedong Yang, Yaoqi Zhou, Jihua Wang

doi:10.1371/journal.pone.0096694

Abstract

As more and more protein sequences are uncovered from increasingly inexpensive sequencing techniques, an urgent task is to find their functions. This work presents a highly reliable computational technique for predicting DNA-binding function at the level of protein-DNA complex structures, rather than low-resolution two-state prediction of DNA-binding as most existing techniques do. The method first predicts protein-DNA complex structure by utilizing the template-based structure prediction technique HHblits, followed by binding affinity prediction based on a knowledge-based energy function (Distance-scaled finite ideal-gas reference state for protein-DNA interactions). A leave-one-out cross validation of the method based on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77 with high precision (94%) and high sensitivity (65%). We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides a reasonably accurate prediction of DNA-binding residues in proteins based on predicted DNA-binding complex structures. Its application to human proteome leads to more than 300 novel DNA-binding proteins; some of these predicted structures were validated by known structures of homologous proteins in APO forms. The method [SPOT-Seq (DNA)] is available as an on-line server at http://sparks-lab.org.
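The reported MCC can be cross-checked against the other figures in the abstract. The sketch below rebuilds an approximate confusion matrix from the stated dataset sizes (179 binding, 3797 non-binding domains), sensitivity (65%) and precision (94%), then computes the MCC. The integer counts are an assumption, because they are back-calculated from rounded percentages rather than taken from the paper, and the function names are illustrative only.

```python
import math


def confusion_from_rates(n_pos, n_neg, sensitivity, precision):
    """Back-calculate approximate confusion-matrix counts.

    Assumption: counts are inferred from rounded percentages, so they
    may differ slightly from the true values in the paper.
    """
    tp = round(n_pos * sensitivity)               # true positives
    fn = n_pos - tp                               # missed binders
    fp = round(tp * (1 - precision) / precision)  # false positives
    tn = n_neg - fp                               # true negatives
    return tp, fp, fn, tn


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


tp, fp, fn, tn = confusion_from_rates(179, 3797, 0.65, 0.94)
value = mcc(tp, fp, fn, tn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn} MCC={value:.2f}")
```

Under these assumptions the recomputed value agrees with the reported MCC of 0.77 to two decimal places, which suggests the stated precision, sensitivity and dataset sizes are mutually consistent.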

Highlights

The completion of thousands of proteome projects has led to an explosive increase in the number of proteins with unknown functions.
A medium-resolution function prediction is to predict the region in a protein that binds with DNA (DNA-binding residues or DNA-binding interface regions).
The method achieved a Matthews correlation coefficient (MCC) of 0.77, higher than that of the best structure-based technique (DDNA3O).

Summary

Introduction

The completion of thousands of proteome projects has led to an explosive increase in the number of proteins with unknown functions. The comprehensive Uniprot database [1] contains 10^7 protein sequences, yet less than 5% of these sequences have annotated functions in the Gene Ontology Annotation database [2]. This gap between the number of sequences and the number of sequences with annotations is widening rapidly as inexpensive and more efficient next-generation sequencing techniques become available. Function prediction of DNA-binding can be classified into three levels of resolution (low, medium and high). A low-resolution function prediction is a simple two-state prediction of whether or not a protein binds to DNA. A high-resolution function prediction is to predict the complex structure between DNA and a target protein of unknown function.

Methods

Results

Conclusion

Journal: PLoS ONE	Publication Date: May 2, 2014
Citations: 85	License type: CC BY 4.0

Similar Papers

Sequence-Based Prediction of DNA-Binding Residues in Proteins with Conservation and Correlation Information
Xin Ma ... Hong-De Liu
IEEE/ACM Transactions on Computational Biology and Bioinformatics | VOL. 9
01 Nov 2012

On the Accuracy of Sequence-Based Computational Inference of Protein Residues Involved in Interactions with DNA
Igor B Kuznetsov ... Zhenkun Gou
Trends in Applied Sciences Research | VOL. 3
01 Apr 2008

Predicting DNA-Binding Residues of Proteins Using Random Forest and Evolutionary Information Combined with Conservation Information
Xin Ma ... Jing Guo
01 May 2011

Protein-DNA complex structure modeling based on structural template
Juan Xie ... Shiyong Liu
Biochemical and Biophysical Research Communications | VOL. 577
08 Sep 2021