Abstract

In metazoans, thousands of protein-coding genes must be differentially expressed in specific cell types, during development, and in response to a wide variety of extracellular signals. Combinatorial gene regulation strategies are required to generate these diverse expression patterns because only a limited number of transcription factors can be encoded by a limited genome. For a gene to be activated, transcription factors must bind distant control regions and promote the decondensation of repressed chromatin. Then, factors bound to distant control regions and the promoter must stimulate the remodeling of individual nucleosomes and transcription initiation by RNA polymerase II, via effective communication with nucleosome remodeling complexes, coactivator complexes, and the general transcription machinery (Lemon and Tjian 2000). Another important feature of combinatorial regulation is the requirement for several distinct transcription factors to activate a gene (Merika and Thanos 2001). By employing combinations of factors, the number of gene expression patterns that can be achieved is greatly enhanced.
Although combinatorial regulation has been widely studied, one potential contributor has received relatively little attention: the core promoter, which is located between approximately −35 and +35 relative to the transcription start site of a metazoan gene. One reason the core promoter was generally not considered an active contributor to combinatorial regulation is historical: when the first protein-coding genes were isolated, virtually every gene, regardless of its expression pattern, contained an A/T-rich sequence 25–30 base pairs (bp) upstream of the transcription start site (Breathnach and Chambon 1981). This sequence, with the consensus TATAAA, was called the TATA box. Following the development of functional assays, mutations in TATA boxes were found to reduce transcription initiation and prevent the proper positioning of transcription start sites.
Based on these early observations, it was expected that a similar core promoter structure would be found in every cellular gene. The regulation of transcription was expected to rely exclusively on DNA-binding proteins that interact with distal promoters and enhancers. Today, this simplistic picture of the structure of core promoters for protein-coding genes has been replaced by a level of complexity that is not yet fully understood or appreciated. Most likely, the initial similarity resulted from the fact that the first promoters for RNA polymerase II were identified in DNA viruses and highly expressed cellular genes, which often contain TATA boxes. As more and more promoters for cellular genes have been isolated, the extensive similarity has vanished. In Drosophila, most core promoters for protein-coding genes fall into two distinct classes (Burke et al. 1998; Kutach and Kadonaga 2000). Roughly half contain a TATA box 25 to 30 bp upstream of the transcription start site combined with an initiator (Inr) element overlapping the start site. The other half contain an Inr element combined with a downstream promoter element (DPE), which is located approximately 30 bp downstream of the start site. Importantly, all three of these elements serve as recognition sites for subunits of transcription factor IID (TFIID), which contains the TATA-binding protein (TBP) and several TBP-associated factors (TAFs) (for review, see Burke et al. 1998; Smale et al. 1998). In mammals, core promoter structure appears to be even more diverse. Precise calculations have been difficult because transcription start sites have been determined accurately for only a small fraction of genes. 
Nevertheless, the available data suggest that (1) a smaller percentage of mammalian promoters than Drosophila promoters contain TATA boxes, (2) TATA boxes are paired with Inr elements in a smaller percentage of mammalian promoters, (3) DPE elements exist in mammalian promoters, but have been difficult to identify, and (4) many promoters, including a number of promoters within CpG islands, appear to lack all three of these core elements.
The diversity of core promoter structure leads to two general questions that are of fundamental importance for an understanding of transcriptional control. First, what are the similarities and differences between the mechanisms of transcription initiation catalyzed by the various core promoter classes? Second, why does core promoter diversity exist? Although the first question has been explored in a number of studies (for reviews, see Burke et al. 1998; Smale et al. 1998; Lemon and Tjian 2000), little is known about the second. One possible reason for core promoter diversity is that the different classes of core promoters may have evolved
E-MAIL steves@hhmi.ucla.edu; FAX (310) 206-8623. Article and publication are at http://www.genesdev.org/cgi/doi/10.1101/gad.937701.
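The positional description of the TATA box given above (consensus TATAAA, 25–30 bp upstream of the transcription start site) can be illustrated with a minimal Python sketch. Several things here are assumptions made for illustration and are not taken from the text: the function name `find_tata_box`, the restriction to exact matches of the consensus, the convention of measuring distance from the first base of the motif, and the synthetic test sequence, which is not a real promoter. Real TATA boxes are degenerate A/T-rich sequences, so an exact-match scan of this kind would miss many of them.

```python
def find_tata_box(sequence, tss_index, consensus="TATAAA",
                  min_upstream=25, max_upstream=30):
    """Return the upstream distances (in bp) at which `consensus` starts.

    `tss_index` is the 0-based index of the transcription start site base.
    A distance of d means the first base of the motif lies d bp upstream
    of the start site. Only exact matches are reported (an assumption made
    to keep the sketch simple).
    """
    seq = sequence.upper()
    hits = []
    for distance in range(min_upstream, max_upstream + 1):
        start = tss_index - distance
        if start < 0:
            # The window extends past the 5' end of the supplied sequence.
            continue
        if seq[start:start + len(consensus)] == consensus:
            hits.append(distance)
    return hits


# Synthetic sequence for illustration only (not a real promoter):
# 20 bp of filler, the consensus, then filler up to the start site.
upstream = "G" * 20 + "TATAAA" + "G" * 24   # 50 bp upstream of the start site
downstream = "A" + "C" * 39                  # start-site base plus 39 bp
promoter = upstream + downstream
tss = len(upstream)                          # index of the start-site base

print(find_tata_box(promoter, tss))          # distances at which TATAAA starts
```

The same scan could in principle be repeated for Inr and DPE elements by passing a different motif and window, but their consensus sequences are not given in this text and would have to be supplied from another source.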
