Abstract
Background: Obtaining high-quality (HQ) reference genomes from microbial communities is crucial for understanding the phylogeny and function of uncultured microbes in complex microbial ecosystems. Despite improvements in bioinformatic approaches to generate curated metagenome-assembled genomes (MAGs), existing metagenome binners yield population consensus genomes that fall well short of the strain-level resolution of genomes sequenced from isolates. Here, we present a framework for the integration of single-cell genomics and metagenomics, referred to as single-cell (sc) metagenomics, to reconstruct strain-resolved genomes from microbial communities at once.
Results: Our sc-metagenomics integration framework, termed SMAGLinker, uses single-cell amplified genomes (SAGs) generated using microfluidic technology as binning guides and integrates them with MAGs to recover improved draft genomes. We compared sc-metagenomics with the metagenomics-alone approach using conventional metagenome binners. In genome reconstruction from the cell mock community, the sc-metagenomics approach showed more precise contig binning and higher recovery rates (>97%) of rRNA genes and plasmids than conventional metagenomics. In human microbiota samples, sc-metagenomics recovered the largest number of genomes, with a total of 103 gut microbial genomes (21 HQ, with 65 showing >90% completeness) and 45 skin microbial genomes (10 HQ, with 40 showing >90% completeness). Conventional metagenomics recovered one Staphylococcus hominis genome, whereas sc-metagenomics recovered two S. hominis genomes from the same skin microbiota sample. Single-cell sequencing revealed that these S. hominis genomes were derived from two distinct strains, each harboring different plasmids. We found that all conventional S. hominis MAGs had a substantial lack or excess of genome sequences and contamination from another Staphylococcus species (S. epidermidis).
Conclusions: SMAGLinker enabled us to obtain strain-resolved genomes in the mock community and human microbiota samples by assigning metagenomic sequences correctly and covering both highly conserved genes, such as rRNA genes, and unique extrachromosomal elements, including plasmids. SMAGLinker will provide HQ genomes that are difficult to obtain using metagenomics alone and will facilitate the understanding of microbial ecosystems by elucidating detailed metabolic pathways and horizontal gene transfer networks. SMAGLinker is available at https://github.com/kojiari/smaglinker.
Highlights
Obtaining high-quality (HQ) reference genomes from microbial communities is crucial for understanding the phylogeny and function of uncultured microbes in complex microbial ecosystems
Our single-cell genomics and metagenomics integration framework, called SMAGLinker, uses single-cell amplified genomes (SAGs), which are produced from the same sample, as guides for metagenome binning (Fig. 1)
Comparing characteristics of single-cell genome-guided bins with conventional metagenomic bins: we evaluated the characteristics of bins collected using sc-metagenomics with SMAGLinker and metagenomics-alone approaches with conventional metagenome binners (Fig. 2)
Summary
Obtaining high-quality (HQ) reference genomes from microbial communities is crucial for understanding the phylogeny and function of uncultured microbes in complex microbial ecosystems. State-of-the-art binners rely on nucleotide compositional information, such as tetranucleotide frequency or GC content, or on sequence coverage [10, 11, 12]. These tools perform differently and produce different MAGs, including incomplete bins and multi-species composite bins [13]. Composite genomes that aggregate sequences originating from multiple distinct species or strains can yield misleading insights if they are registered as single genomes in reference databases [14]. To solve these problems, several approaches combine and curate the results of multiple binners to generate a large number of HQ genomes [13, 15, 16]. In real-world samples, it is difficult to verify binning results because many microbes lack a reference genome and the share of species richness that these microbes account for is unknown.
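To make the compositional signals mentioned above concrete, the following is a minimal Python sketch of how GC content and a tetranucleotide frequency profile could be computed for a single contig. It is an illustration under simplifying assumptions, not the feature extraction of SMAGLinker or of any particular binner: the function names `gc_content` and `tetranucleotide_frequency` are hypothetical, reverse-complement tetramers are not merged, and windows containing ambiguous bases are simply ignored.

```python
from collections import Counter
from itertools import product


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # no unambiguous bases, so GC content is undefined; report 0
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_frequency(seq: str) -> dict[str, float]:
    """Relative frequency of each of the 256 tetramers in a contig.

    Overlapping windows of length 4 are counted; windows containing
    anything other than A, C, G, or T are excluded from the total.
    """
    seq = seq.upper()
    tetramers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in tetramers)
    return {t: (counts[t] / total if total else 0.0) for t in tetramers}


if __name__ == "__main__":
    contig = "ACGTACGTGGCCNNACGT"  # toy sequence, not real data
    print(f"GC content: {gc_content(contig):.3f}")
    profile = tetranucleotide_frequency(contig)
    top = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print("Most frequent tetramers:", top)
```

Contigs from the same genome tend to have similar profiles, which is why such vectors, typically combined with per-sample coverage, can serve as clustering features. Closely related strains or species can also share nearly identical profiles, which is consistent with the composite bins described above.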