Abstract

Viral infections remain a serious global health issue. Metagenomic approaches are increasingly used not only to detect novel viral pathogens but also to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow rapid phylogenetic characterization of these new viruses. Often, however, complete viral genomes are not recovered; instead, several distinct contigs derived from a single entity are obtained, some of which have no sequence homology to any known proteins. De novo assembly of single viruses from a metagenome is challenging, not only because of the lack of a reference genome, but also because of intrapopulation variation and uneven or insufficient coverage. Here we explored different assembly algorithms, remote homology searches, genome-specific sequence motifs, k-mer frequency ranking, and coverage profile binning to detect and obtain viral target genomes from metagenomes. All methods were tested on 454-generated sequencing datasets containing three recently described RNA viruses with relatively large genomes that were divergent from previously known viruses of the families Rhabdoviridae and Coronaviridae. Depending on the specific characteristics of the target virus and the metagenomic community, different assembly and in silico gap closure strategies were successful in obtaining near-complete viral genomes.
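As a rough illustration of what ranking contigs by k-mer frequency can look like, the sketch below computes a normalized k-mer profile per contig and orders contigs by their distance to a seed contig already attributed to the target virus. This is not the procedure used in the study: the function names, the default k = 4, and the Euclidean distance are all assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=4):
    """Return normalized frequencies over all 4**k DNA k-mers (hypothetical helper)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Only k-mers made of A, C, G, T are counted; windows containing N are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


def profile_distance(p, q):
    """Euclidean distance between two profiles built with the same k (assumed metric)."""
    return sqrt(sum((p[m] - q[m]) ** 2 for m in p))


def rank_contigs(seed, contigs, k=4):
    """Rank contigs (name -> sequence) by profile distance to the seed, closest first."""
    seed_profile = kmer_profile(seed, k)
    scored = [
        (profile_distance(seed_profile, kmer_profile(s, k)), name)
        for name, s in contigs.items()
    ]
    return [name for _, name in sorted(scored)]
```

In such a scheme, contigs near the top of the ranking would be candidates for belonging to the same genome as the seed. Short contigs give noisy profiles, so a ranking like this would typically be combined with other evidence, such as coverage.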

Highlights

  • The first metagenome dataset was obtained from a cell culture supernatant (CCS) containing a rhabdovirus-like virus isolated from tissue collected from a stranded white-beaked dolphin (Lagenorhynchus albirostris) (Osterhaus et al., 1993; Siegers et al., 2014)

  • Genetic and phylogenetic characterization of the dolphin rhabdovirus (DRV) revealed that it was closely related to rhabdoviruses of the genera Perhabdovirus and Vesiculovirus found in fish (Siegers et al., 2014)


Summary

Introduction

Human and animal populations are continuously confronted with emerging viral infections (Delwart, 2007; Lipkin, 2010; Smits and Osterhaus, 2013). Metagenome sequencing is increasingly being used to discover and characterize new or (re-)emerging viral pathogens. These techniques are also increasingly used to generate complete genomes of uncultivated viruses and other organisms (Delwart, 2007; Lipkin, 2010; Iverson et al., 2012; Albertsen et al., 2013; Smits and Osterhaus, 2013; Handley et al., 2014). Inherent to the approach is that a large fraction of the metagenome consists of sequences from organisms other than the viral targets, including host sequences, archaea, bacteria, and bacteriophages, despite the physical enrichment strategies for virus particles that are often applied (Van Leeuwen et al., 2010; Kostic et al., 2012; Van Den Brand et al., 2012; Wylie et al., 2012; Bodewes et al., 2013; Schurch et al., 2014).
