THAP-family C2CH zinc-coordinating DNA-binding proteins function in diverse eukaryotic cellular processes, such as transposition, transcriptional repression, stem-cell pluripotency, angiogenesis and neurological function. To determine the molecular basis for sequence-specific DNA recognition by THAP proteins, we solved the crystal structure of the Drosophila melanogaster P element transposase THAP domain (DmTHAP) complexed with a natural 10-base-pair site. In contrast to C2H2 zinc fingers, DmTHAP docks a conserved β-sheet into the major groove and a basic C-terminal loop into the adjacent minor groove. We confirmed specific protein-DNA interactions by mutagenesis and DNA binding assays. Sequence analysis of natural and in vitro-selected binding sites suggests that several THAPs (DmTHAP, human THAP1 and THAP9) recognize a bipartite TxxGGGx(A/T) consensus motif; homology suggests that THAP proteins bind DNA through a bipartite interaction. These findings reveal the conserved mechanisms by which THAP-family proteins engage specific chromosomal target elements.
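As an illustration only, the TxxGGGx(A/T) consensus can be read as an 8-base pattern in which x stands for any base. The sketch below is not the analysis used in this work; it assumes that reading of the motif, scans the forward strand only, and uses a hypothetical function name (`find_thap_sites`) and a made-up test sequence.

```python
import re

# Assumed reading of the consensus TxxGGGx(A/T): x = any of A, C, G, T.
# The lookahead lets overlapping candidate sites be reported as well.
THAP_MOTIF = re.compile(r"(?=(T[ACGT]{2}GGG[ACGT][AT]))")


def find_thap_sites(seq):
    """Return (0-based start, matched 8-mer) for each forward-strand match."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in THAP_MOTIF.finditer(seq)]


# Hypothetical sequence for demonstration, not taken from the paper.
example = "ccTAAGGGTAcc"
print(find_thap_sites(example))
```

A fuller scan would also search the reverse complement, since a binding site can lie on either strand.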