The nucleotides A, C, G, and T combine into a one-dimensional (1D) DNA sequence, which in turn gives rise to the three-dimensional (3D) structure of DNA. Proteins, however, do not literally read the letters A, C, G, and T. Instead, readout is based on spatial interactions between 3D objects, and these interactions depend on the structures of both the DNA and the protein (Rohs et al. 2009; Rohs et al. 2010). Although the 3D structure of DNA depends on the primary sequence, the mapping from DNA sequence to shape is degenerate. Moreover, while high-throughput sequencing technologies continue to produce large amounts of DNA sequence information, experimental data on the 3D structures of DNA remain limited. Together, these facts call for new approaches that provide insight into DNA structure. With this motivation, we have developed a high-throughput tool for predicting the shape of “naked” DNA on a genomic scale (Slattery et al. 2011) and a web server for DNA shape prediction. In parallel, we constructed a database of shape features of transcription factor binding sites, using sequence-based databases as the source of binding motifs. These tools provide a novel approach for studying transcription factor binding sites, especially those of paralogous transcription factors, which share highly similar core-binding motifs but bind different genomic sites in vivo. Our approach can also be used to study less specific protein–DNA interactions, for example, histone–DNA interactions in nucleosomes. Here, we present structural profiles of transcription factor and nucleosome binding sites from various organisms and provide evidence that DNA shape generally plays an important role in protein–DNA recognition. 
Specifically, we present insights into how paralogous transcription factors recognize DNA shape to achieve differential binding specificity, and we suggest that the mechanisms of nucleosome formation are somewhat different in Plasmodium falciparum, whose genome has an extremely high A/T content, than in Saccharomyces cerevisiae and Drosophila melanogaster.