Abstract

Although the complete sequence of the human genome remains a long-term goal, we are clearly at the threshold of the era of real genome sequencing. Undoubtedly, the first relatively modest signs of its advent, i.e. the complete sequence of yeast chromosome III (Oliver et al., 1992) and the sequences of a portion of the Caenorhabditis elegans DNA (Sulston et al., 1992) and of a human brain cDNA library (Adams et al., 1992), will be followed by a flood of further, more significant contributions. Currently, however, complete sequences are available only for viral, plasmid and some organelle genomes. Among these, only viruses provide us with the unique opportunity to compare a set of complete sequences of related genomes with varying levels of similarity. Viral genomes range in size from about two to about 300 kilobases (Francki et al., 1991). Sequencing a small virus genome is by now an almost trivial pursuit; known sequences of this type number in the many dozens, and at least seven sequences of large viral DNAs over 100 kb in length have been reported. Perhaps unexpectedly, it can be argued that this set comprises a considerable fraction of the overall variety of viruses. Indeed, despite the rapid accumulation of sequence information overall, the number of really new virus genome sequences representing distinct virus groups grows only linearly or more slowly (Fig. 1). Moreover, examination of the latest list of such groups (Francki et al., 1991) shows that, at least as regards viruses with small genomes, sequences are already available for most of the genera and higher taxonomic units, and although this is not yet clearly visible (Fig. 1), we may be approaching saturation in the growth of sequence information. In fact, to the best of my knowledge, not a single sequence representing a new group of small viruses was reported in the first half of 1992. Obviously, the situation is quite different at the species and lower levels.
Numerous new isolates and strains of viruses are being sequenced constantly, but they increasingly tend to fall within already recognized groups. Of course, the collection of viral sequences now available is biased by the present research strategy, which favors the detailed study of viruses infecting humans and economically important animals, plants and microorganisms. Nevertheless, the existence of modest but significant sequence similarities between genome products of viruses infecting eukaryotes and eubacteria (e.g., Dolja and Carrington, 1992; Koonin, 1991; Spicer et al., 1988) suggests that the diversity of genome structure among viruses as a whole might not reach far beyond the boundaries set by the current collection.

To what extent is the knowledge gained through analysis of viral genomes useful for the analysis of the much larger cellular genomes? The answer is not trivial because the difference between the two types of genomes is not only quantitative. Being obligate intracellular parasites, viruses, even those with the largest genomes, have greatly reduced sets of genes compared with cellular genomes. They normally encode neither enzymes of intermediate metabolism nor proteins of the translation apparatus. All viruses have genes coding for at least some of the proteins mediating genome replication and expression, but even this group of viral genes is usually incomplete, with the extent of recruitment of cellular proteins varying in
