Abstract

Virus genomes are prone to extensive gene loss, gain, and exchange and share no universal genes. Therefore, in a broad-scale study of virus evolution, gene and genome network analyses can complement traditional phylogenetics. We performed an exhaustive comparative analysis of the genomes of double-stranded DNA (dsDNA) viruses by using the bipartite network approach and found a robust hierarchical modularity in the dsDNA virosphere. Bipartite networks consist of two classes of nodes, with nodes in one class, in this case genomes, being connected via nodes of the second class, in this case genes. Such a network can be partitioned into modules that combine nodes from both classes. The bipartite network of dsDNA viruses includes 19 modules that form 5 major and 3 minor supermodules. Of these modules, 11 include tailed bacteriophages, reflecting the diversity of this largest group of viruses. The module analysis quantitatively validates and refines previously proposed nontrivial evolutionary relationships. An expansive supermodule combines the large and giant viruses of the putative order “Megavirales” with diverse moderate-sized viruses and related mobile elements. All viruses in this supermodule share a distinct morphogenetic tool kit with a double jelly roll major capsid protein. Herpesviruses and tailed bacteriophages comprise another supermodule, held together by a distinct set of morphogenetic proteins centered on the HK97-like major capsid protein. Together, these two supermodules cover the great majority of currently known dsDNA viruses. We formally identify a set of 14 viral hallmark genes that comprise the hubs of the network and account for most of the intermodule connections.
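The bipartite structure described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not perform module detection. It only shows how genomes become connected through shared gene families and how high-degree gene nodes stand out as candidate hubs. The genome names and gene assignments are hypothetical toy data, and the function names are invented for this illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data (not from the paper): genome -> gene families it encodes.
genome_to_genes = {
    "phage_A": {"HK97_MCP", "terminase", "portal"},
    "phage_B": {"HK97_MCP", "terminase", "integrase"},
    "herpes_C": {"HK97_MCP", "terminase", "DNA_pol_B"},
    "giant_D": {"DJR_MCP", "DNA_pol_B", "A32_ATPase"},
    "mid_E": {"DJR_MCP", "A32_ATPase"},
}


def invert(genome_to_genes):
    """Build the gene-side view of the bipartite network: gene -> genomes."""
    gene_to_genomes = defaultdict(set)
    for genome, genes in genome_to_genes.items():
        for gene in genes:
            gene_to_genomes[gene].add(genome)
    return dict(gene_to_genomes)


def shared_gene_counts(genome_to_genes):
    """Count shared gene families for every genome pair that shares at least one."""
    counts = {}
    for a, b in combinations(sorted(genome_to_genes), 2):
        shared = genome_to_genes[a] & genome_to_genes[b]
        if shared:
            counts[(a, b)] = len(shared)
    return counts


def gene_degrees(gene_to_genomes):
    """Rank gene families by the number of genomes they connect (degree)."""
    return sorted(
        ((gene, len(genomes)) for gene, genomes in gene_to_genomes.items()),
        key=lambda item: (-item[1], item[0]),
    )


gene_to_genomes = invert(genome_to_genes)
projection = shared_gene_counts(genome_to_genes)
hubs = gene_degrees(gene_to_genomes)
```

In this toy network, genomes never link to each other directly. Every connection runs through a gene node, so a gene family present in many genomes acts as a hub, which is the role the abstract assigns to the hallmark genes.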

Highlights

  • Virus genomes are prone to extensive gene loss, gain, and exchange and share no universal genes

  • In order to develop a network representation of the relationships between all major groups of double-stranded DNA (dsDNA) viruses, we first had to identify the families of homologs that would become the nodes of the “gene family” class

  • A comparison of the gene families obtained through this pipeline and the available clusters of orthologous genes for bacteriophages (POGs) [9, 35] and large nucleo-cytoplasmic DNA viruses of eukaryotes (NCVOGs) [36, 37] yielded a recall of 0.92, and purity (1 minus the average fraction of false positives) of 0.89
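As a rough illustration of the recall and purity figures quoted above, the sketch below scores predicted gene families against reference clusters. The source defines purity only as 1 minus the average fraction of false positives. The best-overlap matching rule, the per-cluster averaging, and all names and data here are assumptions made for this example and may differ from the authors' procedure.

```python
def best_match(reference, families):
    """Return the predicted family with the largest overlap with a reference cluster.

    Assumption: each reference cluster is scored against its best-overlapping family.
    """
    return max(families, key=lambda family: len(family & reference))


def recall_and_purity(reference_clusters, families):
    """Average recall, and purity as 1 minus the average false-positive fraction."""
    recalls = []
    false_positive_fractions = []
    for reference in reference_clusters:
        family = best_match(reference, families)
        overlap = len(family & reference)
        # Recall: share of the reference cluster recovered by the matched family.
        recalls.append(overlap / len(reference))
        # False positives: family members that are absent from the reference cluster.
        false_positive_fractions.append((len(family) - overlap) / len(family))
    n = len(reference_clusters)
    return sum(recalls) / n, 1 - sum(false_positive_fractions) / n


# Hypothetical toy data: gene identifiers are arbitrary labels.
reference_clusters = [{"a", "b", "c", "d"}, {"e", "f"}]
families = [{"a", "b", "c", "x"}, {"e", "f"}, {"d"}]

recall, purity = recall_and_purity(reference_clusters, families)
```

Here the first reference cluster loses one member ("d") and its matched family gains one outsider ("x"), which lowers recall and purity respectively, while the second cluster is recovered exactly.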

Summary

Introduction

Virus genomes are prone to extensive gene loss, gain, and exchange and share no universal genes. An expansive supermodule combines the large and giant viruses of the putative order “Megavirales” with diverse moderate-sized viruses and related mobile elements. All viruses in this supermodule share a distinct morphogenetic tool kit with a double jelly roll major capsid protein. With the possible exception of some intracellular parasitic bacteria with highly degraded genomes, viruses and/or other selfish elements, such as transposons and plasmids, parasitize all cellular organisms. Complementary to their physical dominance in the biosphere, viruses collectively appear to encompass the bulk of the genetic diversity on Earth [7,8,9]. The broader connectivity of the evolutionary network in the virus world derives from a small group of genes that have been termed virus hallmark genes, which encode key proteins involved in genome replication and virion formation and are shared by overlapping sets of diverse viruses [17,18,19]. Recent application of network analysis methods to the comparative analysis of microbial and bacteriophage genomes has been productive, in particular for the identification of preferred routes and patterns of horizontal gene transfer (HGT) [27,28,29].
