MinYS: mine your symbiont by targeted genome assembly in symbiotic communities

Cervin Guyomar, Claire Lemaitre, Wesley Delage, Jean-Christophe Simon, Fabrice Legeai, Christophe Mougel

doi:10.1093/nargab/lqaa047

Abstract

Most metazoans are associated with symbionts. Characterizing the effect of a particular symbiont often requires access to its genome, which is usually obtained by sequencing the whole community. We present MinYS, a targeted approach for assembling a particular genome of interest from such metagenomic data. First, taking advantage of a reference genome, a subset of the reads is assembled into a set of backbone contigs. Then, this draft assembly is completed using the whole metagenomic readset in a de novo manner. The resulting assembly is output as a genome graph, enabling different strains with potential structural variants coexisting in the sample to be distinguished. MinYS was applied to 50 pea aphid resequencing samples, with variable diversity in symbiont communities, in order to recover the genome sequence of its obligatory bacterial symbiont, Buchnera aphidicola. It was able to return high-quality assemblies (one contig assembly in 90% of the samples), even when using increasingly distant reference genomes, and to retrieve large structural variations in the samples. Because of its targeted nature, it outperformed standard metagenomic assemblers in terms of both time and assembly quality.
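The role of the genome graph output can be illustrated with a small sketch. The code below is not MinYS code and does not reproduce its output format: the graph, the node names (`contig_1`, `fill_a`, and so on) and the `enumerate_paths` helper are all hypothetical. It only shows the general idea that when two backbone contigs are joined by more than one gap-filling sequence, each alternative path through the graph corresponds to one candidate strain-level sequence.

```python
from typing import Dict, List

# Hypothetical toy genome graph (an assumption for illustration only):
# nodes are backbone contigs or gap-filling sequences, and an edge
# A -> B means "sequence B can directly follow sequence A".
GenomeGraph = Dict[str, List[str]]


def enumerate_paths(graph: GenomeGraph, start: str, end: str) -> List[List[str]]:
    """Return every simple path from start to end, found by depth-first search."""
    paths: List[List[str]] = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            paths.append(path)
            continue
        for successor in graph.get(node, []):
            if successor not in path:  # never visit a node twice in one path
                stack.append((successor, path + [successor]))
    return sorted(paths)


# Two backbone contigs linked by two alternative gap-fillings, as could
# happen if two strains differing by a structural variant coexist.
toy_graph: GenomeGraph = {
    "contig_1": ["fill_a", "fill_b"],
    "fill_a": ["contig_2"],
    "fill_b": ["contig_2"],
    "contig_2": [],
}

paths = enumerate_paths(toy_graph, "contig_1", "contig_2")
for path in paths:
    print(" -> ".join(path))
```

In this toy case the graph contains two paths, one through each gap-filling sequence. A graph with a single path would correspond to a sample in which only one version of the region was assembled.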

Highlights

Advances in molecular techniques have greatly contributed to the recognition of the importance of microorganisms in every ecosystem.
These datasets are unbalanced: the great majority of the reads often originate from the host genome, but since the genomes of the symbionts are often several orders of magnitude smaller than that of the eukaryotic host, symbiont genomes can have large read depth in such samples. This enables the extraction of relevant information about the symbionts, but requires significant effort, since the host reads are a computational burden for most analyses. In this context, providing bioinformatic tools that enable the assembly of a particular genome of interest from a metagenomic sample, ignoring the overwhelming amount of reads from other organisms, would greatly accelerate the characterization of symbiont genomes and help decipher particular host–symbiont relationships.
The number of reads is on average 84 (198) million for individual sequencing datasets, with an average coverage of 628× (3694×) for the B. aphidicola genome. In these datasets, >90% of the reads originate from the insect host and are not useful when focusing on symbiont genomes.
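The relation between read count, coverage and the share of symbiont reads can be checked with a back-of-the-envelope calculation. The sketch below uses the 84 million reads and 628× coverage quoted above; the read length (100 bp) and the B. aphidicola genome size (about 0.64 Mb) are assumptions introduced here for illustration and are not stated in this text. The function name is hypothetical.

```python
def symbiont_read_fraction(total_reads: float, coverage: float,
                           genome_size_bp: float, read_length_bp: float) -> float:
    """Estimate the fraction of reads needed to reach a given coverage.

    Uses the standard approximation:
        coverage = n_reads * read_length / genome_size
    """
    reads_needed = coverage * genome_size_bp / read_length_bp
    return reads_needed / total_reads


TOTAL_READS = 84e6       # from the text: average for individual datasets
COVERAGE = 628           # from the text: average B. aphidicola coverage
GENOME_SIZE_BP = 640e3   # assumption: approximate genome size, not from the text
READ_LENGTH_BP = 100     # assumption: read length, not from the text

fraction = symbiont_read_fraction(TOTAL_READS, COVERAGE,
                                  GENOME_SIZE_BP, READ_LENGTH_BP)
print(f"Estimated share of reads from the symbiont: {fraction:.1%}")
```

Under these assumptions only a few percent of the reads are needed to reach the reported depth, which is compatible with the statement that more than 90% of the reads come from the host. Different read lengths would change the estimate proportionally.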

Summary

Introduction

Advances in molecular techniques have greatly contributed to the recognition of the importance of microorganisms in every ecosystem. As symbionts are generally not cultivable outside the host, the whole community is usually sequenced, resulting in a metagenomic dataset mixing host and symbiont reads. These datasets are unbalanced: the great majority of the reads often originate from the host genome, but since the genomes of the symbionts are often several orders of magnitude smaller than that of the eukaryotic host, symbiont genomes can have large read depth in such samples. This enables the extraction of relevant information about the symbionts, but requires significant effort, since the host reads are a computational burden for most analyses. In this context, providing bioinformatic tools that enable the assembly of a particular genome of interest from a metagenomic sample, ignoring the overwhelming amount of reads from other organisms, would greatly accelerate the characterization of symbiont genomes and help decipher particular host–symbiont relationships.

Methods

Results

Conclusion

Journal: NAR Genomics and Bioinformatics	Publication Date: Jul 3, 2020
Citations: 8	License type: CC BY 4.0
