Abstract

High-throughput sequencing has revolutionized the field of microbiology; however, reconstructing complete genomes of organisms from whole metagenomic shotgun sequencing data remains a challenge. Recovered genomes are often highly fragmented due to uneven abundances of organisms, repeats within and across genomes, sequencing errors, and strain-level variation. To address the fragmented nature of metagenomic assemblies, scientists rely on a process called binning, which clusters together contigs inferred to originate from the same organism. Existing binning algorithms use oligonucleotide frequencies and contig abundance (coverage) within and across samples to group together contigs from the same organism. However, these algorithms often miss short contigs and contigs from regions with unusual coverage or DNA composition characteristics, such as mobile elements. Here, we propose that information from assembly graphs can assist current strategies for metagenomic binning. We use MetaCarvel, a metagenomic scaffolding tool, to construct assembly graphs in which contigs are nodes and edges are inferred from paired-end reads. We developed a tool, Binnacle, that extracts information from the assembly graphs and clusters scaffolds into comprehensive bins. Binnacle also provides wrapper scripts to integrate with existing binning methods. The Binnacle pipeline can be found on GitHub (https://github.com/marbl/binnacle). We show that binning graph-based scaffolds, rather than contigs, improves the contiguity and quality of the resulting bins and captures a broader set of the genes of the organisms being reconstructed.
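To make the graph idea concrete, the following is a minimal sketch, not MetaCarvel's or Binnacle's actual algorithm. It assumes a toy graph in which contigs are nodes and each edge carries a count of supporting paired-end reads, and it groups contigs by connected components after dropping weakly supported edges. The function name `group_contigs`, the `min_support` threshold, and the example data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_contigs(contigs, links, min_support=3):
    """Group contigs into connected components of a paired-end link graph.

    contigs: list of contig identifiers (graph nodes).
    links: iterable of (contig_a, contig_b, support) tuples, where support
        is the number of read pairs spanning the two contigs (hypothetical).
    min_support: edges with fewer supporting pairs are ignored (assumed cutoff).
    """
    # Build an undirected adjacency map from sufficiently supported links.
    adjacency = defaultdict(set)
    for a, b, support in links:
        if support >= min_support:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    groups = []
    for start in contigs:
        if start in seen:
            continue
        # Depth-first traversal collects every contig reachable from start.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(component))
    return groups


# Toy example: c3 might be a short contig that composition- and
# coverage-based binning would miss, but read pairs tie it to c2.
contigs = ["c1", "c2", "c3", "c4"]
links = [("c1", "c2", 12), ("c2", "c3", 5), ("c3", "c4", 1)]
groups = group_contigs(contigs, links)
```

In this toy case the weakly supported link to c4 is discarded, so c4 stays in its own group while c1, c2, and c3 are grouped together. A real scaffolder must also handle repeats, contig orientation, and variant structures in the graph, all of which this sketch ignores.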

Introduction

Advances in high-throughput sequencing strategies have spurred microbiome research and revealed important insights into the microbial communities that inhabit human, animal, and environmental habitats (The Human Microbiome Project Consortium, 2012; Oh et al., 2014; Zeevi et al., 2019). Whole metagenomic shotgun sequencing, which allows for a comprehensive analysis of microbial DNA from a sample, has been instrumental in expanding our understanding of the functional potential and genetic composition of different microorganisms that have not been previously cultured. An important step in characterizing organisms that have not been isolated is the reconstruction of their complete genome sequence (Uritskiy and DiRuggiero, 2019; Mu et al., 2020). This process involves assembling short metagenomic reads into longer contiguous sequences (contigs) based on sequence overlap. The uneven abundance of organisms, repetitive sequences within and across genomes, sequencing errors, and strain-level variations within a single sample often contribute to incomplete and fragmented assemblies.
