Abstract

Background

Long-read sequencing has shown its tremendous potential to address genome assembly challenges, e.g., achieving the first telomere-to-telomere assembly of a gapless human chromosome. However, many issues remain unresolved when leveraging error-prone long reads to characterize high-complexity metagenomes, for instance, complete/high-quality genome reconstruction from highly complex systems.

Results

Here, we developed an iterative haplotype-resolved hierarchical clustering-based hybrid assembly (HCBHA) approach that capitalizes on a hybrid (error-prone long reads and high-accuracy short reads) sequencing strategy to reconstruct (near-) complete genomes from highly complex metagenomes. Using the HCBHA approach, we first phase short and long reads from the highly complex metagenomic dataset into different candidate bacterial haplotypes, then perform hybrid assembly of each bacterial genome individually. We reconstructed 557 metagenome-assembled genomes (MAGs) with an average N50 of 574 Kb from a deeply sequenced, highly complex activated sludge (AS) metagenome. These high-contiguity MAGs contained 14 closed genomes and 111 high-quality (HQ) MAGs including full-length rRNA operons, which accounted for 61.1% of the microbial community. Leveraging the near-complete genomes, we also profiled the metabolic potential of the AS microbiome and identified 2153 biosynthetic gene clusters (BGCs) encoded within the recovered AS MAGs.

Conclusion

Our results established the feasibility of an iterative haplotype-resolved HCBHA approach to reconstruct (near-) complete genomes from highly complex ecosystems, providing new insights into “complete metagenomics”. The retrieved high-contiguity MAGs illustrated that various biosynthetic gene clusters (BGCs) were harbored in the AS microbiome. The high diversity of BGCs highlights the potential to discover new natural products biosynthesized by the AS microbial community, aside from the traditional function (e.g., organic carbon and nitrogen removal) in wastewater treatment.
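For readers unfamiliar with the N50 statistic quoted in the abstract, the following is a minimal sketch of how it is conventionally computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the total assembly length. The function name `n50` and the example contig lengths are hypothetical and chosen for illustration only; they are not taken from the study's pipeline or data.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half of the
        # assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp) for a single MAG, not data from the study.
example_mag = [900_000, 574_000, 300_000, 120_000, 60_000]
print(n50(example_mag))
```

An average N50 across MAGs, as reported in the abstract, would then be the mean of this value computed separately for each genome.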

Highlights

  • Rapid advances in long-read (third-generation) sequencing by Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) have demonstrated its ability to resolve genome assembly challenges [1,2,3], e.g., long repeat regions and structural variants [1, 4, 5]

  • Seven out of eight bacterial species in the Mock dataset were assembled into single-contig genomes, with three representing full circular genomes

  • We found that a large proportion of genomes harbored many of the above secondary metabolite potentials but had no identified antibiotic resistance genes (ARGs) (Additional file 2, Fig. S8b); this may reflect the limitations of current ARG databases and an incomplete understanding of resistance mechanisms


Summary

Introduction

Rapid advances in long-read (third-generation) sequencing by Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) have demonstrated its ability to resolve genome assembly challenges [1,2,3], e.g., long repeat regions and structural variants [1, 4, 5]. Even with extensive short-read-based polishing, it remains challenging to generate a sufficient number of high-accuracy sequences, especially for metagenomes with high complexity [8]. Several assemblers have been developed to take advantage of long reads, including the hybrid assemblers Unicycler (for bacterial isolates) [9] and OPERA-MS (for clinical metagenomics) [2], and the long-read assemblers Canu [10] and Flye [11]. However, applying these assemblers to complex environmental samples is constrained by limited feasibility, limited accuracy, and demanding computational requirements. Many issues therefore remain unresolved when leveraging error-prone long reads to characterize high-complexity metagenomes, for instance, complete/high-quality genome reconstruction from highly complex systems.

