Abstract

Background: Viruses are important components of microbial communities, modulating community structure and function; however, only a few tools are currently available for phage identification and analysis from metagenomic sequencing data. Here we employed the random forest algorithm to develop VirMiner, a web-based phage contig prediction tool that is especially sensitive for high-abundance phage contigs, trained and validated on paired metagenomic and phagenomic sequencing data from the human gut flora.

Results: VirMiner achieved 41.06% ± 17.51% sensitivity and 81.91% ± 4.04% specificity in the prediction of phage contigs. In particular, for high-abundance phage contigs, VirMiner outperformed other tools, with much higher sensitivity (65.23% ± 16.94%) than VirFinder (34.63% ± 17.96%) and VirSorter (18.75% ± 15.23%) at almost the same specificity. Moreover, VirMiner provides the most comprehensive phage analysis pipeline, comprising metagenomic raw read processing, functional annotation, phage contig identification, and phage-host relationship prediction (CRISPR-spacer recognition), and supports two-group comparison when the input metagenomic sequence data include different conditions (e.g., case and control). Application of VirMiner to an independent cohort of human gut metagenomes obtained from individuals treated with antibiotics revealed that 122 KEGG orthology groups and 118 Pfam groups differed significantly in abundance between the pre-treatment samples and the samples taken at the end of antibiotic administration, including groups related to clustered regularly interspaced short palindromic repeats (CRISPR), multidrug resistance, and protein transport. The VirMiner webserver is available at http://sbb.hku.hk/VirMiner/.

Conclusions: We developed a comprehensive tool for phage prediction and analysis for metagenomic samples. Compared to VirSorter and VirFinder, the most widely used tools, VirMiner is able to capture more high-abundance phage contigs, which could play key roles in infecting bacteria and modulating microbial community dynamics.

Trial registration: The European Union Clinical Trials Register, EudraCT Number: 2013-003378-28. Registered on 9 April 2014.
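For readers less familiar with the two metrics quoted in the Results, the following is a minimal sketch of how sensitivity and specificity are computed from per-contig predictions. It is not VirMiner's code: the function name and the ten example labels are hypothetical, and the only assumption is the standard definition, with "positive" meaning a contig labeled as phage.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[bool], y_pred: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary phage/non-phage labels.

    True means "phage contig". This helper is hypothetical and is shown only
    to illustrate the standard definitions of the two metrics.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)

    # Sensitivity: fraction of true phage contigs that were predicted as phage.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of non-phage contigs that were predicted as non-phage.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up labels for ten contigs (illustrative only, not data from the study).
example_truth = [True, True, True, True, False, False, False, False, False, False]
example_pred = [True, True, False, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(example_truth, example_pred)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

A tool can therefore be highly specific (few non-phage contigs called phage) while still missing many true phage contigs, which is why the abstract compares sensitivity at almost the same specificity.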

Highlights

  • Viruses are essential constituents of microbial communities contributing to their homeostasis and evolution

  • Updated phage orthologous groups (POGs) database: the majority of the POGs in the database initially developed in 2012 by Kristensen et al. [27] (POG2012) are virus-specific, with regions of low homology to prokaryotic genomes

  • We further identified virus-specific POGs that can help to distinguish prophage genes from other components in microbial genomes, based on the virus quotient (VQ), which was measured as the quotient of the frequency of matches to viral genome [27]


Summary

Introduction

Viruses are essential constituents of microbial communities, contributing to their homeostasis and evolution. Phages can modulate the structure and function of a bacterial community through horizontal gene transfer (HGT) [2], thereby altering bacterial phenotypes, including virulence, antibiotic resistance, and biofilm formation [3,4,5]. Such phage-induced alterations could pose health risks by influencing bacterial pathogenicity and antibiotic resistance. In one exploratory bioinformatics analysis, two known and 421 newly predicted antibiotic resistance genes (ARGs) were identified in 1181 publicly available phage genomes; however, experimental tests expressing four of the predicted ARGs in Escherichia coli did not lead to increased antibiotic resistance. We employed the random forest algorithm to develop VirMiner, a web-based phage contig prediction tool that is especially sensitive for high-abundance phage contigs, trained and validated on paired metagenomic and phagenomic sequencing data from the human gut flora.
