Abstract

Background: Virus discovery using high-throughput next-generation sequencing has become more commonplace. However, although analysis of deep next-generation sequencing data allows us to identify potential pathogens, the entire analytical procedure requires competency in the bioinformatics domain, which includes implementing proper software packages and preparing prerequisite databases. Simple and user-friendly bioinformatics pipelines are urgently required to obtain complete viral genome sequences from metagenomic data. Results: This manuscript presents a pipeline, drVM (detect and reconstruct known viral genomes from metagenomes), for rapid viral read identification, genus-level read partition, read normalization, de novo assembly, sequence annotation, and coverage profiling. The first two procedures and sequence annotation rely on known viral genomes as a reference database. drVM was validated via the analysis of over 300 sequencing runs generated by Illumina and Ion Torrent platforms to provide complete viral genome assemblies for a variety of virus types including DNA viruses, RNA viruses, and retroviruses. drVM is available for free download at: https://sourceforge.net/projects/sb2nhri/files/drVM/ and is also assembled as a Docker container, an Amazon machine image, and a virtual machine to facilitate seamless deployment. Conclusions: drVM was compared with other viral detection tools to demonstrate its merits in terms of viral genome completeness and reduced computation time. This substantiates the platform's potential to produce prompt and accurate viral genome sequences from clinical samples.
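Two of the steps named in the abstract, genus-level read partition and coverage profiling, can be illustrated with a toy sketch. This is not drVM's code: the function names, the input formats (read-to-reference hits as pairs, alignments as 0-based end-exclusive intervals), and the `min_depth` threshold are all assumptions made for illustration, and a real pipeline would work from aligner output rather than in-memory lists.

```python
from collections import defaultdict


def partition_by_genus(read_hits, accession_to_genus):
    """Group read IDs by the genus of the reference each read matched.

    read_hits: iterable of (read_id, reference_accession) pairs (assumed format).
    accession_to_genus: dict mapping a reference accession to a genus name.
    """
    bins = defaultdict(list)
    for read_id, accession in read_hits:
        genus = accession_to_genus.get(accession)
        if genus is not None:  # reads whose reference has no known genus are skipped
            bins[genus].append(read_id)
    return dict(bins)


def coverage_profile(alignments, ref_length):
    """Per-base depth for (start, end) alignments, 0-based and end-exclusive."""
    diff = [0] * (ref_length + 1)  # difference array: +1 at start, -1 at end
    for start, end in alignments:
        start = max(0, start)
        end = min(ref_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depth = []
    running = 0
    for i in range(ref_length):
        running += diff[i]
        depth.append(running)
    return depth


def completeness(depth, min_depth=1):
    """Fraction of reference positions covered at or above min_depth."""
    if not depth:
        return 0.0
    return sum(1 for d in depth if d >= min_depth) / len(depth)


if __name__ == "__main__":
    # Hypothetical accessions and reads, for illustration only.
    hits = [("r1", "REF_A"), ("r2", "REF_B"), ("r3", "REF_A"), ("r4", "REF_X")]
    genus_of = {"REF_A": "Flavivirus", "REF_B": "Enterovirus"}
    print(partition_by_genus(hits, genus_of))

    depth = coverage_profile([(0, 4), (2, 6)], ref_length=8)
    print(depth, completeness(depth))
```

Partitioning before assembly keeps reads from unrelated viruses out of the same assembly job, and the completeness fraction is one simple way to express how much of a reference genome a set of reads spans.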

Highlights

  • As next-generation sequencing (NGS) technology is becoming a more common means to detect pathogens in clinical samples, our goal is to establish a simple and effective pipeline that allows accurate and rapid viral genome reconstruction from metagenomic NGS data generated from complex clinical samples

Summary

Introduction

Virus discovery using high-throughput next-generation sequencing has become more commonplace. Over the past two decades, avian influenza H5N1 virus [4], SARS coronavirus, pandemic H1N1 influenza virus, MERS coronavirus [5], Ebola virus [6], and Zika virus [7] have emerged in the human population. During such outbreaks, identification of the causative agent and comparative genome analysis are of cornerstone importance for disease surveillance and epidemiology. The technique holds the promise to aid in the identification of potential pathogens in a single assay without a priori knowledge of the ...

Received: 25 July 2016; Revised: 10 November 2016; Accepted: 15 January 2017

