Abstract

The human gut microbiota performs functions that are essential for maintaining host physiology. However, characterizing the functioning of microbial communities in relation to the host remains challenging in reference-based metagenomic analyses. Because taxonomic and functional analyses are performed independently, the link between genes and species remains unclear. Although a first set of species-level bins was built by clustering co-abundant genes, no reference bin set has been established for the most widely used gut microbiota catalog, the Integrated Gene Catalog (IGC). To identify the most suitable method for grouping the IGC genes, we benchmarked nine taxonomy-independent binners implementing abundance-based, hybrid and integrative approaches. For this purpose, we designed a simulated non-redundant gene catalog (SGC) and computed adapted assessment metrics. Overall, the best trade-off between the main metrics is reached by an integrative binner. For each approach, we then compared the results of the best-performing binner with our expected community structures and applied the method to the IGC. Each of the three approaches has specific advantages, as well as inherent or scalability limitations. Hybrid and integrative binners show promising and potentially complementary results but require improvements before they can be used on the IGC to recover human gut microbial species.

Highlights

  • The human gut microbiota represents one of the densest microbial environments, harboring approximately 10^13 microbial cells [1] and housing a few hundred microbial species in a healthy individual [2], including bacteria, phages, archaea and microeukaryotes

  • In order to assess the suitability of different binning methods to cluster a large non-redundant set of genes such as the Integrated Gene Catalog (IGC) into species-level bins, we selected and applied to our simulated non-redundant gene catalog (SGC) nine taxonomy-independent binners implementing abundance-based (MGS-CANOPY and MSPMINER), hybrid (SOLIDBIN, COCACOLA, CONCOCT, METABAT, MAXBIN and MYCC) and integrative (DAS TOOL) approaches

  • We present the results of our comprehensive evaluation, which comprises, on the one hand, overall benchmarking results using a gold standard and metrics that take into account the assignment of genes to multiple bins and, on the other hand, a deeper exploration of the results of the best-performing binners, based on specific definitions, representations and points of comparison that exploit the particularities of our SGC
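
To make the idea of gold-standard metrics that tolerate genes assigned to multiple bins more concrete, here is a minimal sketch in Python. It is not the evaluation procedure of the study: the metric definitions (per-bin purity and per-genome completeness), the function names and the toy data are all assumptions chosen for illustration. A gene that appears in several bins is simply counted once in each of them.

```python
from collections import Counter

def bin_precision(bins, gold):
    """Per-bin purity: share of a bin's genes that come from its dominant genome.

    bins: dict mapping bin id -> set of gene ids (a gene may be in several bins)
    gold: dict mapping gene id -> genome of origin (hypothetical gold standard)
    """
    result = {}
    for bin_id, genes in bins.items():
        counts = Counter(gold[g] for g in genes if g in gold)
        total = sum(counts.values())
        result[bin_id] = max(counts.values()) / total if total else 0.0
    return result

def genome_recall(bins, gold):
    """Per-genome completeness: largest share of a genome's genes found in one bin."""
    genome_sizes = Counter(gold.values())
    best = {genome: 0.0 for genome in genome_sizes}
    for genes in bins.values():
        counts = Counter(gold[g] for g in genes if g in gold)
        for genome, n in counts.items():
            best[genome] = max(best[genome], n / genome_sizes[genome])
    return best

# Toy data (invented): gene g4 is assigned to both bins.
gold = {"g1": "A", "g2": "A", "g3": "A", "g4": "B", "g5": "B"}
bins = {"bin1": {"g1", "g2", "g4"}, "bin2": {"g4", "g5", "g3"}}

precision = bin_precision(bins, gold)
recall = genome_recall(bins, gold)
print(precision)
print(recall)
```

In this toy case the shared gene g4 lowers the purity of bin1 while contributing to the completeness of genome B in bin2, which is the kind of trade-off such metrics are meant to expose.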

Introduction

The human gut microbiota represents one of the densest microbial environments, harboring approximately 10^13 microbial cells [1] and housing a few hundred microbial species in a healthy individual [2], including bacteria, phages, archaea and microeukaryotes. The gut microbiota and its host maintain a complex symbiotic relationship that is strongly associated with host health states, and a disruption in the composition of the microbiota is observed in many diseases [6]. Understanding how these different microorganisms play a role in human health is of crucial interest. However, characterizing the functioning of microbial communities in relation to the host remains a challenge. This is notably due to the small percentage of cultivable microorganisms in the gut microbiota, which makes quantitative metagenomics indispensable for further exploring the composition and diversity of microbial communities, as it gives access to the genes and genomes of uncultivated species [7].

