Abstract

Accurate identification of transcription factor binding sites is critical to our understanding of transcriptional regulatory networks. To overcome the high false-positive rates that trouble sequence-based prediction techniques, we have developed a structure-based prediction method that considers the interactions between the amino acids of a transcription factor and the nucleotides of its DNA binding sequence at the structural level, along with an efficient protein-DNA docking algorithm. Docked protein-DNA structures are evaluated using a knowledge-based energy function in conjunction with van der Waals energy. Our docking algorithm supports quasi-flexible docking, overcoming a number of limitations faced by similar docking algorithms. Our rigid-body docking algorithm was tested on a dataset of 141 nonredundant transcription factor-DNA complex structures. When the native DNA structures are used, 63.1% of the 141 complexes are reconstructed with a root mean square deviation (RMSD) better than 1.0 Å, and 79.4% with an RMSD better than 3.0 Å. Our quasi-flexible docking algorithm, which assumes that the DNA structures are not known, was tested on a separate set of 45 transcription factor-DNA complexes: 57.8% of the docked conformations achieve RMSDs better than 1.0 Å, and 71.1% have RMSDs below 3.0 Å. We also applied our method to predict the binding motifs of the ferric uptake regulator in E. coli and showed that most of the experimentally identified sites can be predicted accurately.
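The accuracy figures above are RMSD values and success rates at fixed cutoffs. As a minimal sketch of how such numbers can be computed, the snippet below assumes that the docked and native coordinates are already in the same reference frame and list the same atoms in the same order; no superposition is performed. The function names `rmsd` and `fraction_within` and the toy coordinates are illustrative assumptions, not part of the published method.

```python
import math


def rmsd(predicted, native):
    """RMSD between two equally ordered lists of (x, y, z) coordinates.

    Assumption: both structures share one reference frame, so no
    superposition step is applied before measuring the deviation.
    """
    if len(predicted) != len(native) or not native:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_p, atom_n in zip(predicted, native)
        for p, q in zip(atom_p, atom_n)
    )
    return math.sqrt(squared / len(native))


def fraction_within(rmsds, cutoff):
    """Fraction of complexes whose RMSD is strictly below the cutoff."""
    return sum(r < cutoff for r in rmsds) / len(rmsds)


# Toy data (hypothetical, not taken from the paper): every atom is
# displaced by 1.0 along z, so the RMSD is exactly 1.0.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(docked, native))                        # 1.0
print(fraction_within([0.5, 2.0, 4.0, 0.8], 1.0))  # 0.5
```

A success rate such as "63.1% better than 1.0 Å" corresponds to `fraction_within(all_rmsds, 1.0)` evaluated over the per-complex RMSDs of the test set.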
