Abstract

Background: The process of protein-DNA binding has an essential role in the biological processing of genetic information. We use relational machine learning to predict DNA-binding propensity of proteins from their structures. Automatically discovered structural features are able to capture some characteristic spatial configurations of amino acids in proteins.

Results: Prediction based only on structural relational features already achieves results competitive with existing methods based on physicochemical properties on several protein datasets. Predictive performance is further improved when structural features are combined with physicochemical features. Moreover, the structural features provide some insights not revealed by physicochemical features. Our method is able to detect common spatial substructures. We demonstrate this in experiments with zinc finger proteins.

Conclusions: We introduced a novel approach for DNA-binding propensity prediction using relational machine learning, which could potentially also be used for protein function prediction in general.

Highlights

  • The process of protein-DNA binding has an essential role in the biological processing of genetic information

  • We compared classifiers based on structural patterns discovered by our method (SF) with classifiers based on 10 physicochemical features (PF) identified as most predictive by Szilagyi and Skolnick’s method [8]

  • We trained classifiers based on both structural features and physicochemical features (PSF)
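The three feature sets named above (SF, PF, PSF) can be illustrated with a minimal sketch. It assumes that each structural feature is a binary indicator of whether a discovered pattern occurs in a protein and that each physicochemical feature is a single number; the protein names, values, and the helper `combine_features` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def combine_features(sf: List[int], pf: List[float]) -> List[float]:
    """Concatenate structural indicators (SF) and physicochemical values (PF)
    into one combined vector (PSF)."""
    return [float(v) for v in sf] + list(pf)


# Hypothetical toy data: 3 structural pattern indicators and
# 2 physicochemical values per protein (the paper uses 10 PF features).
proteins: Dict[str, Dict[str, list]] = {
    "protA": {"sf": [1, 0, 1], "pf": [0.12, 3.4]},
    "protB": {"sf": [0, 0, 1], "pf": [0.05, 1.9]},
}

# Build the three representations that separate classifiers would be trained on.
feature_sets = {
    name: {
        "SF": [float(v) for v in rec["sf"]],
        "PF": list(rec["pf"]),
        "PSF": combine_features(rec["sf"], rec["pf"]),
    }
    for name, rec in proteins.items()
}
```

Each of the three vectors could then be passed to the same learning algorithm, so that any difference in predictive performance is attributable to the features rather than to the classifier.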

Introduction

DNA-binding proteins have a vital role in the biological processing of genetic information, such as DNA transcription, replication, maintenance and the regulation of gene expression. In this paper we are interested in predicting the DNA-binding propensity of proteins from their structural information and physicochemical properties. This approach contrasts with some of the most recent methods, which are based on similarity of proteins, for example structural alignment or threading-based methods [1,2,3] or methods exploiting information about evolutionary conservation of amino acids in proteins [4]. Methods exploiting evolutionary information can be more accurate than the approaches aiming to infer binding propensity purely
