Abstract

Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.

Highlights

  • The DNA binding specificities of transcription factors (TFs) can be described as consensus sequences or position frequency matrices (PFMs) representing the probability of occurrence of each nucleotide at each position of a DNA binding site

  • In the present study, we describe the development of our TFBSshape database, which provides DNA structural features for nucleotide sequences preferred by different TFs

  • TFBSshape derives TF binding site (TFBS) sequence information from the motif databases JASPAR [25] and UniPROBE [26] and generates DNA shape data for TFBSs based on the high-throughput prediction of DNA structural features, including the parameters minor groove width (MGW), Roll, propeller twist (ProT) and helix twist (HelT) [24]


Introduction

The DNA binding specificities of transcription factors (TFs) can be described as consensus sequences or position frequency matrices (PFMs) representing the probability of occurrence of each nucleotide at each position of a DNA binding site. In the present study, we describe the development of our TFBSshape database, which provides DNA structural features for nucleotide sequences preferred by different TFs. We analysed 739 datasets derived from open-access motif databases that describe the DNA binding specificities of TFs from 23 different species.
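To make the PFM representation concrete, the following is a minimal sketch of how a PFM, per-position nucleotide probabilities and a consensus sequence could be derived from a set of aligned binding sites. It is an illustration only and not the TFBSshape implementation; the function names and the four example sites are hypothetical and are not taken from JASPAR or UniPROBE.

```python
NUCLEOTIDES = "ACGT"


def position_frequency_matrix(sites):
    """Count each nucleotide at each position of aligned, equal-length sites."""
    if not sites:
        raise ValueError("at least one binding site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")

    pfm = {nt: [0] * length for nt in NUCLEOTIDES}
    for site in sites:
        for pos, nt in enumerate(site.upper()):
            if nt not in pfm:
                raise ValueError(f"unexpected character {nt!r} in site {site!r}")
            pfm[nt][pos] += 1
    return pfm


def to_probabilities(pfm):
    """Convert raw counts to per-position probabilities of occurrence."""
    n_sites = sum(pfm[nt][0] for nt in NUCLEOTIDES)
    return {nt: [count / n_sites for count in counts] for nt, counts in pfm.items()}


def consensus(pfm):
    """Return the most frequent nucleotide at each position (ties resolved in A, C, G, T order)."""
    length = len(pfm["A"])
    return "".join(
        max(NUCLEOTIDES, key=lambda nt: pfm[nt][pos]) for pos in range(length)
    )


# Hypothetical aligned sites, invented for illustration only.
example_sites = ["CACGTG", "CACGTG", "CATGTG", "CACGTT"]
example_pfm = position_frequency_matrix(example_sites)
example_probabilities = to_probabilities(example_pfm)
example_consensus = consensus(example_pfm)
```

A consensus sequence keeps only the most frequent nucleotide per position, whereas the PFM retains the full distribution. Both describe the binding site by sequence alone, which is the information that DNA structural features are intended to complement.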
