Abstract

Background: The specific recognition of genomic cis-regulatory elements by transcription factors (TFs) plays an essential role in the regulation of coordinated gene expression. Studying the mechanisms determining binding specificity in protein-DNA interactions is thus an important goal. Most current approaches for modeling TF specific recognition rely on the knowledge of large sets of cognate target sites and consider only the information contained in their primary sequence.

Results: Here we describe a structure-based methodology for predicting sequence motifs starting from the coordinates of a TF-DNA complex. Our algorithm combines information regarding the direct and indirect readout of DNA into an atomistic statistical model, which is used to estimate the interaction potential. We first measure the ability of our method to correctly estimate the binding specificities of eight prokaryotic and eukaryotic TFs that belong to different structural superfamilies. Secondly, the method is applied to two homology models, finding that sampling of interface side-chain rotamers remarkably improves the results. Thirdly, the algorithm is compared with a reference structural method based on contact counts, obtaining comparable predictions for the experimental complexes and more accurate sequence motifs for the homology models.

Conclusion: Our results demonstrate that atomic-detail structural information can be feasibly used to predict TF binding sites. The computational method presented here is universal and might be applied to other systems involving protein-DNA recognition.

Highlights

  • We evaluated the performance of our algorithm on a set of four bacterial and four eukaryotic TFs that have been co-crystallized bound to DNA. In most cases our results proved to be as good as or better than those obtained with the structure-based cumulative contact method by Morozov and Siggia [32].

  • Two TF homology models were analyzed in detail and used to predict their DNA binding motifs, after sampling side-chain rotamers at their contact interfaces. In this case the results we obtained were significantly better than those returned by the reference method, which indicates that our algorithm could be used to study TFs of unknown structure starting from structural models.

Introduction

The specific recognition of genomic cis-regulatory elements by nucleic acid binding proteins is of critical importance for many vital processes such as DNA replication and repair, mRNA translation, and transcriptional regulation. The probabilistic models most commonly used in such computational approaches are position weight matrices (PWMs) obtained from multiple alignments of known binding sites. This approach is limited to TFs with a sufficient number of experimentally identified binding sites, for which reliable statistical models may be built. An alternative approach would be to predict DNA operator sites which are compatible with the mode of binding of a given
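
To make the sequence-based baseline mentioned above concrete, the following is a minimal sketch of how a PWM can be derived from a gapless alignment of known binding sites. It is an illustration only and not the structure-based method of this paper. The function names `build_pwm` and `score_site`, the pseudocount of 1, and the uniform background frequencies are all assumptions made for the example.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=None):
    """Build a log-odds PWM from equal-length aligned binding sites.

    Assumptions (not from the paper): a pseudocount is added to every
    base count, and the background is uniform unless one is supplied.
    """
    if not sites:
        raise ValueError("at least one binding site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    if background is None:
        background = {base: 0.25 for base in BASES}

    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        # Count how often each base occurs in this alignment column.
        counts = Counter(site[pos] for site in sites)
        column = {}
        for base in BASES:
            freq = (counts[base] + pseudocount) / total
            # Log-odds of the observed frequency against the background.
            column[base] = math.log2(freq / background[base])
        pwm.append(column)
    return pwm


def score_site(pwm, seq):
    """Sum the per-position log-odds weights for a candidate site."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match PWM length")
    return sum(column[base] for column, base in zip(pwm, seq))


# Hypothetical toy alignment, for illustration only.
example_sites = ["ACGT", "ACGT", "ACGA", "ACGT"]
example_pwm = build_pwm(example_sites)
```

With only four toy sites, the pseudocount dominates the estimated frequencies, which illustrates the limitation noted above: a reliable PWM needs a sufficient number of experimentally identified binding sites.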
