Abstract

Transcriptional regulation requires the binding of transcription factors (TFs) to short sequence-specific DNA motifs, usually located in gene regulatory regions. Interestingly, a vast amount of data accumulated from genomic assays has shown that only a small fraction of all potential binding sites containing the consensus motif of a given TF actually bind the protein. Recent in vitro binding assays, which exclude the effects of the cellular environment, also demonstrate selective TF binding. An intriguing conjecture is that the surroundings of cognate binding sites have unique characteristics that distinguish them from other sequences that contain a similar motif but are not bound by the TF. To test this hypothesis, we conducted a comprehensive analysis of the sequence and DNA shape features surrounding the core-binding sites of 239 and 56 TFs extracted from in vitro HT-SELEX binding assays and in vivo ChIP-seq data, respectively. Comparing the nucleotide content of the regions around the TF-bound sites with that of the counterpart unbound regions containing the same consensus motifs revealed significant differences that extend far beyond the core-binding site. Specifically, the environment of the bound motifs demonstrated unique sequence compositions, DNA shape features, and overall high similarity to the core-binding motif. Notably, the regions around the binding sites of TFs that belong to the same TF family exhibited similar features, with high agreement between the in vitro and in vivo data sets. We propose that these unique features assist in guiding TFs to their cognate binding sites.

Highlights

  • Transcriptional regulation is highly dependent on the binding of transcription factors (TFs) to short DNA binding motifs (Matys et al 2003; Bryne et al 2008)

  • We found that the majority of TFs show differences in the GC composition surrounding their binding motifs (Supplemental Figs. 1, 2, 3A; for a comparison of each nucleotide separately, see Supplemental Fig. 4A), with a difference of up to 16% in GC content between the bound and unbound sequences (3.4% on average) (Fig. 2B)

  • When we clustered TFs based on their Pfam binding domain (Finn et al 2014), we found that TFs belonging to evolutionarily related domains often have similar environmental preferences (Fig. 2A)
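
To make the GC-content comparison in the highlights concrete, the following is a minimal sketch of how the GC fraction of the regions flanking a motif could be compared between bound and unbound sequences. It is not the authors' pipeline: the function names (`gc_fraction`, `flanks`, `mean_flank_gc`), the flank width, the motif, and the four toy sequences are all assumptions chosen for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T); 0.0 if none."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def flanks(seq: str, motif: str, width: int) -> str:
    """Bases within `width` positions of the first motif hit, motif excluded."""
    seq = seq.upper()
    start = seq.find(motif.upper())
    if start < 0:
        raise ValueError("motif not found in sequence")
    end = start + len(motif)
    return seq[max(0, start - width):start] + seq[end:end + width]


def mean_flank_gc(seqs, motif: str, width: int) -> float:
    """Average GC fraction of the motif flanks over a set of sequences."""
    values = [gc_fraction(flanks(s, motif, width)) for s in seqs]
    return sum(values) / len(values)


# Toy data (hypothetical): every sequence contains the same motif,
# and only the surrounding bases differ.
motif = "CACGTG"
bound = ["GCGCCACGTGGCGC", "CCGGCACGTGCGGA"]
unbound = ["ATATCACGTGATAT", "TTAGCACGTGTAAC"]

diff = mean_flank_gc(bound, motif, 4) - mean_flank_gc(unbound, motif, 4)
print(f"Flank GC difference (bound - unbound): {diff:.1%}")
```

A real analysis would use much wider flanks and many sequences per TF, and would assess the difference statistically; this sketch only shows the quantity being compared.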

Introduction

Transcriptional regulation is highly dependent on the binding of transcription factors (TFs) to short DNA binding motifs (Matys et al 2003; Bryne et al 2008). Selective binding of motifs by TFs has been observed in a variety of in vitro experiments (Noyes et al 2008; Badis et al 2009; Berger and Bulyk 2009; Zhao et al 2009; Slattery et al 2011; Enuameh et al 2013; Gordân et al 2013; Jolma et al 2013; Afek et al 2014; Weirauch et al 2014; Abe et al 2015; Levo et al 2015). These in vitro studies show that TFs can bind different sequences containing a similar motif with a wide range of affinities, which suggests that TF-DNA binding specificity is influenced by the DNA context surrounding the motif. We propose that the sequence environment around the consensus motif may help guide TFs to their cognate binding sites.
