Human-associated microbial communities are complex mixtures of bacterial species and diverse strains present at varying abundances. Because metagenomic assemblers and genome binning tools have inherent limitations in recovering low-abundance species (<1%) and strains, we lack comprehensive insight into these communities. Although many bioinformatics approaches are available for recovering metagenome-assembled genomes, their effectiveness in recovering low-abundance species and strains is often questioned. Moreover, each tool has its trade-offs, which makes selecting the right tools challenging. In this study, we investigated the combined effect of various assemblers and binning tools on the recovery of low-abundance species and strain-resolved genomes from real and simulated human metagenomes. We evaluated nine combinations of metagenome assemblers and genome binning tools for their potential to recover genomes of usable quality. Our results revealed that the metaSPAdes-MetaBAT2 combination is highly effective in recovering low-abundance species, while MEGAHIT-MetaBAT2 excels in recovering strain-resolved genomes. These findings highlight the significant variation in performance among combinations, even when they are applied to the same objective, and suggest that the choice of assembler-binner combination has a profound impact on metagenome analyses. We believe this study will be a cornerstone for the scientific community, guiding the choice of tools by highlighting their complementary effects. Furthermore, it underscores the potential of existing tools to address current challenges in the field by improving the recovery of information from metagenomes.