Abstract

Recovering genomes from shotgun metagenomic sequence data allows detailed taxonomic and functional characterization of individual species or strains in a microbial community. Retrieving these metagenome-assembled genomes (MAGs) involves seven stages. First, low-quality bases, along with adapter and host sequences, are removed. Second, overlapping sequences are assembled to create longer contiguous fragments. Third, these fragments are clustered based on sequence composition and abundance. Fourth, these sequence clusters, or bins, undergo rounds of quality assessment and refinement to yield MAGs. The optional fifth stage is dereplication of MAGs to select representatives. Next, each MAG is taxonomically classified. The optional seventh stage is assessing the fraction of diversity that has been recovered. The output of this protocol is draft genomes, which can provide invaluable clues about uncultured organisms. This protocol takes ~1 week to run, depending on computational resources available, and requires prior experience with high-performance computing, shell script programming and Python.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call