Abstract

Although long-read sequencing improves the contiguity of assembled viral genomes compared to short-read methods, assembling complex viral communities remains an open problem. We describe viralFlye, a tool for identifying and analyzing metagenome-assembled viruses in long-read assemblies. We show that it significantly improves viral assemblies and demonstrate that long reads yield a much larger array of predicted virus-host associations than short-read assemblies. We also demonstrate that identifying novel CRISPR arrays in bacterial genomes from a newly assembled metagenomic sample provides information for predicting novel hosts for novel viruses.

Highlights

  • Various metagenomic studies have greatly expanded the set of known viral genomes [1,2,3,4,5,6] and have raised the challenge of inferring the metagenome-assembled viruses (MAVs)

  • Short-read sequencing has been the dominant technology for the discovery of novel MAVs [2]

  • We show that long reads improve the accuracy of virus-host association predictions based on matching of the CRISPR-Cas sites


Summary

Introduction

Various metagenomic studies have greatly expanded the set of known viral genomes [1,2,3,4,5,6] and have raised the challenge of inferring the metagenome-assembled viruses (MAVs). Since the International Committee on Taxonomy of Viruses has proposed including MAVs in viral taxonomy studies [7], there is a need for novel bioinformatics tools to accurately assemble, identify, verify, analyze, and classify MAVs. So far, short-read sequencing has been the dominant technology for the discovery of novel MAVs [2]. Such discoveries are usually made by assembling a viral metagenome (virome) using general-purpose metagenomic assemblers (such as metaSPAdes [8] or Megahit [9]) or specialized viral assemblers (such as metaviralSPAdes [10]). Complete sequencing of giant viruses has been a challenging task [12, 13].

