We have isolated and analyzed human CTCF cDNA clones and show here that the ubiquitously expressed 11-zinc-finger factor CTCF is an exceptionally highly conserved protein displaying 93% identity between avian and human amino acid sequences. It binds specifically to regulatory sequences in the promoter-proximal regions of chicken, mouse, and human c-myc oncogenes. CTCF contains two transcription repressor domains transferable to a heterologous DNA binding domain. One CTCF binding site, conserved in mouse and human c-myc genes, is found immediately downstream of the major P2 promoter at a sequence which maps precisely within the region of RNA polymerase II pausing and release. Gel shift assays of nuclear extracts from mouse and human cells show that CTCF is the predominant factor binding to this sequence. Mutational analysis of the P2-proximal CTCF binding site and transient-cotransfection experiments demonstrate that CTCF is a transcriptional repressor of the human c-myc gene. Although there is 100% sequence identity in the DNA binding domains of the avian and human CTCF proteins, the regulatory sequences recognized by CTCF in chicken and human c-myc promoters are clearly diverged. Mutating the contact nucleotides confirms that CTCF binding to the human c-myc P2 promoter requires a number of unique contact DNA bases that are absent in the chicken c-myc CTCF binding site. Moreover, proteolytic-protection assays indicate that several more CTCF Zn fingers are involved in contacting the human CTCF binding site than the chicken site. Gel shift assays utilizing successively deleted Zn finger domains indicate that CTCF Zn fingers 2 to 7 are involved in binding to the chicken c-myc promoter, while fingers 3 to 11 mediate CTCF binding to the human promoter. 
This flexibility in Zn finger usage reveals CTCF to be a unique "multivalent" transcription factor and provides the first feasible explanation of how certain homologous genes (e.g., c-myc) of different vertebrate species are regulated by the same factor and maintain similar expression patterns despite significant promoter sequence divergence.