MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes.

Jonathan Verneau, Bernard La Scola, Anthony Levasseur, Philippe Colson, Didier Raoult

doi:10.3389/fmicb.2016.00428

Abstract

The number of metagenomic studies conducted each year is growing dramatically, and storing and analyzing such large datasets is difficult and time-consuming. Analyses show that environmental and human metagenomes include a significant proportion of non-annotated sequences, representing a ‘dark matter.’ We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set, and applied this tool to the detection of sequences matching virophages or large and giant DNA viruses of the proposed order Megavirales. A total of 1,045 environmental and human metagenomes (≈ 1 terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences were collected from 93 Megavirales representatives, including 19 giant viruses of amoeba, and from 5 virophages. The pipeline, named MG-Digger, consists of scripts written in Python. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger annotated hundreds of metagenome sequences as best matching those of giant viruses. These sequences were most often similar to phycodnavirus or mimivirus sequences, but also included reads related to the recently described pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combines stand-alone use on Linux or Windows operating systems through a user-friendly interface, ready-to-use customized metagenome and query sequence databases, adjustable parameters for BLAST searches, and output files containing the selected reads with their best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome. The present work shows that massive, automated, and recurrent analyses of metagenomes are effective in improving knowledge about the presence and prevalence of giant viruses in the environment and the human body.
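
The read-selection step described above (keep the reads that significantly match a reference sequence and record their best match) can be illustrated with a minimal sketch. This is not MG-Digger's code. It assumes BLAST+ tabular output (`-outfmt 6`) with the default twelve columns, and that the metagenome reads are in the `qseqid` column; if the reads were on the database side of the search, `read_field` and `ref_field` would be swapped. The function name `best_hits` and the thresholds are hypothetical and are not taken from the paper.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue=1e-5, min_identity=0.0,
              read_field="qseqid", ref_field="sseqid"):
    """Return {read_id: best hit} for hits that pass the thresholds.

    The default thresholds are illustrative assumptions, not values
    reported for MG-Digger.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(BLAST_COLUMNS, row))
        evalue = float(hit["evalue"])
        identity = float(hit["pident"])
        bitscore = float(hit["bitscore"])
        if evalue > max_evalue or identity < min_identity:
            continue
        read_id = hit[read_field]
        current = best.get(read_id)
        # Keep only the highest-scoring hit for each read.
        if current is None or bitscore > current["bitscore"]:
            best[read_id] = {
                "reference": hit[ref_field],
                "evalue": evalue,
                "identity": identity,
                "bitscore": bitscore,
            }
    return best


if __name__ == "__main__":
    # Made-up rows for illustration only.
    example = (
        "read1\tmimivirus_A\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t150\n"
        "read1\tphycodna_B\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-40\t180\n"
        "read2\tpandora_C\t60.0\t50\t20\t0\t1\t50\t1\t50\t0.5\t30\n"
    )
    for read_id, hit in best_hits(io.StringIO(example)).items():
        print(read_id, hit["reference"], hit["evalue"])
```

Exposing the e-value and identity cutoffs as arguments mirrors the adjustable BLAST parameters mentioned in the abstract; the actual parameter set offered by MG-Digger is not specified here.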

Highlights

The first giant virus of amoeba, Mimivirus, was isolated in 2003 from a water sample by co-culturing on Acanthamoeba polyphaga, a strategy implemented to find Legionella-like bacteria (La Scola et al., 2003; Raoult et al., 2007).
The pipeline dedicated to the search for giant virus-related sequences in metagenomes comprises several scripts written in Python and includes independent modules (Figure 1).
MG-Digger, a user-friendly computational tool implemented in our laboratory for the detection of Megavirales-like or virophage-like sequences in metagenomes, automatically generated ready-to-analyze metagenome files and annotated hundreds of sequences as significantly matching those of giant viruses or virophages.

Summary

Introduction

The first giant virus of amoeba, Mimivirus, was isolated in 2003 from a water sample by co-culturing on Acanthamoeba polyphaga, a strategy implemented to find Legionella-like bacteria (La Scola et al., 2003; Raoult et al., 2007). Giant viruses of amoebae comprise new viral families, including Mimiviridae (La Scola et al., 2008; Pagnier et al., 2013) and Marseilleviridae (Boyer et al., 2009; Colson et al., 2012b; Pagnier et al., 2013), and two new putative viral families that include the pandoravirus isolates (currently the largest known viruses; Philippe et al., 2013) and Pithovirus sibericum (Legendre et al., 2014). These giant viruses are related to the group of nucleocytoplasmic large DNA viruses (NCLDVs), described since 2001 as being composed of five viral families: Ascoviridae, Asfarviridae, Iridoviridae, Phycodnaviridae, and Poxviridae, whose members infect a wide variety of eukaryotic hosts (Iyer et al., 2001; Yutin et al., 2009). The size of these virions and their gene complements have changed our view of the viral world and its diversity, and have called into question the definition and classification of viruses (Raoult and Forterre, 2008; Colson et al., 2012a; Raoult, 2014).

Journal: Frontiers in Microbiology
Publication Date: Mar 31, 2016
Citations: 26
License type: CC BY
