Learning probabilistic protein-DNA recognition codes from DNA-binding specificities using structural mappings.

Joshua L Wetzel, Kaiqian Zhang, Mona Singh

doi:10.1101/gr.276606.122

Abstract

Knowledge of how proteins interact with DNA is essential for understanding gene regulation. Although DNA-binding specificities for thousands of transcription factors (TFs) have been determined, the specific amino acid–base interactions comprising their structural interfaces are largely unknown. This lack of resolution hampers attempts to leverage these data to predict specificities for uncharacterized TFs or TFs mutated in disease. Here we introduce recognition code learning via automated mapping of protein–DNA structural interfaces (rCLAMPS), a probabilistic approach that uses DNA-binding specificities for TFs from the same structural family to simultaneously infer both which nucleotide positions are contacted by particular amino acids within the TF and a recognition code that relates each base-contacting amino acid to nucleotide preferences at the DNA positions it contacts. We apply rCLAMPS to homeodomains, the second largest family of TFs in metazoans, and show that it learns a highly effective recognition code that can predict de novo DNA-binding specificities for TFs. Furthermore, we show that the inferred amino acid–nucleotide contacts reveal whether and how nucleotide preferences at individual binding site positions are altered by mutations within TFs. Our approach is an important step toward automatically uncovering the determinants of protein–DNA specificity from large compendia of DNA-binding specificities and inferring the altered functionalities of TFs mutated in disease.
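
To make the idea of a "recognition code" concrete, the following is a minimal toy sketch, not the rCLAMPS implementation. It assumes an additive log-linear form (the abstract does not specify the model), and the contact map `CONTACTS`, the table `WEIGHTS`, and the function `predict_pwm` are all hypothetical names with made-up values. Each base-contacting amino acid contributes a score per base at the DNA positions it contacts, and the summed scores are normalized into a probability column.

```python
import math

BASES = "ACGT"

# Hypothetical contact map: DNA position -> protein positions assumed to contact it.
CONTACTS = {0: [0], 1: [0, 1]}

# Hypothetical recognition code (made-up numbers):
# (protein position, amino acid, DNA position) -> additive score for A, C, G, T.
WEIGHTS = {
    (0, "N", 0): [2.0, 0.0, 0.0, 0.0],
    (0, "N", 1): [0.0, 0.0, 0.0, 1.0],
    (1, "K", 1): [0.0, 0.0, 1.0, 0.0],
    (1, "Q", 1): [0.0, 1.0, 0.0, 0.0],
}


def softmax(scores):
    """Convert additive scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_pwm(residues):
    """Predict one probability column per DNA position from contacting residues.

    `residues` is a string holding the amino acid at each protein position.
    Residue/position pairs missing from WEIGHTS contribute zero scores.
    """
    pwm = []
    for dna_pos in sorted(CONTACTS):
        scores = [0.0] * len(BASES)
        for prot_pos in CONTACTS[dna_pos]:
            key = (prot_pos, residues[prot_pos], dna_pos)
            contribution = WEIGHTS.get(key, [0.0] * len(BASES))
            scores = [s + c for s, c in zip(scores, contribution)]
        pwm.append(softmax(scores))
    return pwm


wild_type = predict_pwm("NK")
mutant = predict_pwm("NQ")  # mutate protein position 1 from K to Q
```

In this toy setup, mutating protein position 1 changes only the column for DNA position 1, because that is the only position it is assumed to contact. This mirrors, in a simplified way, how an inferred contact map could indicate which binding site positions a mutation may affect.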

Journal: Genome Research
Publication Date: Sep 1, 2022
Citations: 10
License type: CC BY-NC

Similar Papers

MADS specificity: Unravelling the dual function of the MADS domain protein FRUITFULL
Hilda Van Mourik
10 Nov 2019

Variation in Homeodomain DNA Binding Revealed by High-Resolution Analysis of Sequence Preferences
Michael F Berger ... Timothy R Hughes
Cell | Vol. 133
01 Jun 2008

Transcription Factors and DNA Play Hide and Seek.
David M Suter
Trends in Cell Biology | Vol. 30
07 Apr 2020

A deterministic code for transcription factor-DNA recognition through computation of binding interfaces.
Marco Trerotola ... Saverio Alberti
NAR Genomics and Bioinformatics | Vol. 4
13 Jan 2022
