The 5′ termini of polyoma virus early region transcripts synthesized during the productive infection of permissive mouse cells by wild-type or tsa virus, and those expressed in a variety of transformed rodent cell lines, have been mapped on the viral genome. Results were obtained using the S1 nuclease and primer extension gel mapping procedures. Principal 5′ ends of cytoplasmic polyadenylated mRNAs in every instance mapped between nucleotides 145 and 156 (numbering according to Soeda et al., 1980) in the DNA sequence, 17 to 28 base-pairs before the translational initiation codon for early proteins and 26 to 37 base-pairs after a sequence agreeing with the TATA box consensus. Our data implied a minimum of two different termini within this region of the genome, at nucleotide 147 ± 2 and at 152 ± 2. The 5′ ends at 147 ± 2 were particularly common in the mRNA overproduced after thermal inactivation of the large T protein at late times during infection. The sequences determining the principal 5′ termini, unlike the analogous sequences in the closely related simian virus 40 (SV40), are distinct from those involved in the viral origin of DNA replication. A number of minor alternative 5′ termini of cytoplasmic mRNAs, located both before and after the principal 5′ ends, were also detected. Of those downstream from the principal termini, one at nucleotide 300 ± 2 was prominent. Although this apparent 5′ end is well within the early protein coding sequence, it occurs at a position 31 ± 2 base-pairs after a second TATA box. The several minor apparent 5′ termini mapping upstream of the principal termini occurred primarily in the vicinity of the highly conserved papovavirus DNA replication origin sequence. This sequence includes a third TATA box. Two of the minor 5′ termini, at 14 ± 2 and at 20 ± 2, were near the consensus distance from this TATA box, but the others mapped within or before it.
Early region mRNA extracted at late times from the cytoplasm of cells infected with wild-type virus, or with tsa virus at the permissive temperature, was usually (but not invariably) enriched for RNA species with apparent 5′ termini mapping in the replication origin region, as well as for even longer RNAs. Such RNAs were correctly spliced and had the normal polyadenylated 3′ ends. They were very minor in the cytoplasmic mRNA overproduced after thermal inactivation of the large T protein at late times during infection, but the nuclear RNA from these cells comprised giant species with highly heterogeneous apparent 5′ ends, including predominantly those in the origin region and others further upstream. Nuclear viral RNA from most transformed cell lines lacked the giant species and had principal 5′ termini in the region of nucleotides 145 to 156. We consider two models to account for the presence of the longer mRNAs in the cytoplasm of infected cells at late times during infection. The first postulates a shift in transcriptional initiation sites to upstream positions because of the repressor action of the large T protein. The second, which we favour, proposes that the longer species occur because of inefficient transcriptional termination, which leads to transcription around the entire circular genome and consequently to the eventual accumulation of long cleavage products in the cytoplasm. We further studied the mRNAs expressed by three viable deletion mutants (Bendig et al., 1980) that lack all or part of the sequence determining the principal 5′ termini and the TATA box that precedes it. These efficiently synthesized early region mRNA with slightly or highly heterogeneous 5′ ends. Two of the mutants (dl-75 and dl-17) produced mRNAs with principal alternative 5′ ends located slightly before or after the deletions. The 5′ ends of these mutant mRNAs corresponded to those of very minor transcripts of wild-type templates.
These results suggest that sequence information other than the TATA box has an important role in specifying the approximate position of mRNA 5′ ends.