Abstract

Comprehensive comparison of the genes of different kinds of species is becoming feasible through the study of messenger RNA sequences, 29 of which have been published that contain 50 or more completely determined continuous codons. These 29 sequences are analyzed here for several characteristics believed useful for distinguishing molecular strategies of evolution. Of the three viral genomes entirely sequenced, MS2 is a single-stranded RNA coliphage [1], φX174 a single-stranded DNA coliphage [2], and SV40 a double-stranded oncogenic primate virus [3]. Use of the codon catalog varies greatly among these viruses. For example, codons of the type NCG (where N is any base) are completely absent in SV40, but relatively abundant in other viruses [1-7]. It was noted early that G tends to be rare as the third base of highly degenerate viral codons [8], and this has since been found generally true in MS2, φX174 and SV40. However, SV40 is the extreme case: for 326 of the 1511 codons in its best-defined translated sequences, opportunity exists for use of an NCG triplet, but each time A, C or U appears as the third base. Also, the frequency order of degenerate bases in SV40 codons (C<G<A<U) is the same in each of the four main genes, but greatly different from that in animals or other viruses, as seen below. I now calculate the frequency of each kind of two-base sequence (doublet), following Russell and Subak-Sharpe [9], to aid in a quantitative differentiation of these viruses and other species. Among the 16 possible base doublets, CG has the lowest frequency by a wide margin in the SV40 genome. The CG doublet is rare not only in NCG codons (that is, in codon positions II-III), but also in the two other combinations of codon position (I-II and III-I). The CG doublet is also rare, although to a lesser degree, in the untranslated part of the genome, as previously observed [3].
In this virus, therefore, the CG doublet is discriminated against independently of its potential for amino acid coding (in codon positions I-II, CG codes for arginine). The CG doublet is rarer in SV40 than is any doublet in any other genome. This study may provide indications of viral origins and of the molecular constraints that exist during gene and genome evolution. My analysis demonstrates that SV40 differs enormously from other viruses and from the coding nucleic acids of the mammalian cell it parasitizes.
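
The counts described above can be illustrated with a minimal Python sketch. It is not the procedure of Russell and Subak-Sharpe [9] nor the one used in this study; it simply tallies overlapping doublets, splits them by codon position (I-II, II-III, III-I), and counts how often a codon with C as second base ends in G. The function names and the toy sequence are hypothetical, and the sketch assumes the input is a coding sequence that starts in frame.

```python
from collections import Counter
from itertools import product

BASES = "ACGU"  # RNA alphabet; T in the input is converted to U


def normalize(seq):
    """Upper-case the sequence and express it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def doublet_frequencies(seq):
    """Relative frequency of each of the 16 doublets, counted over
    overlapping two-base windows (a simplifying assumption)."""
    seq = normalize(seq)
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    doublets = ["".join(p) for p in product(BASES, repeat=2)]
    total = sum(counts[d] for d in doublets)
    return {d: (counts[d] / total if total else 0.0) for d in doublets}


def doublets_by_codon_position(coding_seq):
    """Doublet counts split by codon position: I-II, II-III, and III-I
    (third base of one codon plus first base of the next)."""
    seq = normalize(coding_seq)
    labels = ("I-II", "II-III", "III-I")
    counts = {label: Counter() for label in labels}
    for i in range(len(seq) - 1):
        counts[labels[i % 3]][seq[i:i + 2]] += 1
    return counts


def ncg_usage(coding_seq):
    """Return (opportunities, used): the number of codons with C as second
    base, and how many of those have G as third base."""
    seq = normalize(coding_seq)
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    opportunities = [c for c in codons if c[1] == "C"]
    used = sum(1 for c in opportunities if c[2] == "G")
    return len(opportunities), used


if __name__ == "__main__":
    toy = "GCGACGUCA"  # hypothetical toy sequence: codons GCG, ACG, UCA
    print(doublet_frequencies(toy)["CG"])
    print(doublets_by_codon_position(toy)["II-III"]["CG"])
    print(ncg_usage(toy))
```

Comparing the CG entry across the three codon-position tallies is the kind of check that separates a coding constraint (which would affect only positions I-II) from a constraint on the doublet itself.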

