Abstract
Background: Strain-level RNA virus characterization is essential for developing prevention and treatment strategies. Viral metagenomic data, which can contain sequences of both known and novel viruses, provide new opportunities for characterizing RNA viruses. Although a number of pipelines exist for analyzing viruses in metagenomic data, they share two main limitations. First, viruses that lack closely related reference genomes cannot be detected with high sensitivity. Second, strain-level analysis is usually missing.
Results: In this study, we developed a hybrid pipeline named TAR-VIR that reconstructs viral strains without relying on complete or high-quality reference genomes. It is optimized for identifying RNA viruses from metagenomic data by combining an effective read classification method and our in-house strain-level de novo assembly tool. TAR-VIR was tested on both simulated and real viral metagenomic data sets. The results demonstrated that TAR-VIR competes favorably with other tested tools.
Conclusion: TAR-VIR can be used standalone for viral strain reconstruction from metagenomic data. Alternatively, its read recruiting stage can be used with other de novo assembly tools for superior viral functional and taxonomic analyses. The source code and the documentation of TAR-VIR are available at https://github.com/chjiao/TAR-VIR.
Highlights
Strain-level RNA virus characterization is essential for developing prevention and treatment strategies
To test whether increased levels of anelloviruses or other viruses in plasma are associated with higher levels of persistent T-cell activation during anti-retroviral therapy (ART), Li et al. detected all viruses using metagenomic data of plasma samples from 19 adults on effective ART [4]
Overview of our work: Here we introduce TAR-VIR, which provides a useful addition to existing tools for identifying targeted RNA viruses and their haplotypes in metagenomic data
Summary
Strain-level RNA virus characterization is essential for developing prevention and treatment strategies. Pathogenic human viruses such as human immunodeficiency virus (HIV), hepatitis C virus (HCV), Severe Acute Respiratory Syndrome coronavirus (SARS-CoV), and H1N1 influenza virus still claim millions of lives each year despite centuries of research on vaccines and treatments [1, 2]. There are global-scale studies on viruses in natural environmental samples such as ocean water [6, 7]. In addition to these examples, a more comprehensive review of studies using viral metagenomic data in diagnostics, surveillance and outbreak source tracing, and biodiversity research can be found in [8].