Abstract

The recent prevalence of high-throughput sequencing has produced numerous prokaryotic community structure datasets. Although the trait-based approach is useful for interpreting those datasets from ecological perspectives, available trait information is biased toward culturable prokaryotes, especially those of clinical and public health relevance, and thus may not represent the breadth of microbiota found across many of Earth's environments. To facilitate habitat-based analysis free of such bias, here we report a ready-to-use prokaryotic habitat database, ProkAtlas. ProkAtlas comprehensively links 16S rRNA gene sequences to prokaryotic habitats using public shotgun metagenome datasets. We also developed a computational pipeline for habitat-based analysis of given prokaryotic community structures. After confirming the effectiveness of the method using 16S rRNA gene sequence datasets from individual genomes and the Earth Microbiome Project, we demonstrated its validity and effectiveness in drawing ecological insights by applying it to six empirical prokaryotic community datasets from soil, aquatic, and human gut samples.

Highlights

  • In the era of high-throughput sequencing, huge numbers of prokaryotic community structure datasets are being produced by 16S rRNA gene amplicon and shotgun metagenomic sequencing methods (Ramirez et al, 2018; Thompson et al, 2017)

  • After confirming the effectiveness of the method using 16S rRNA gene sequence datasets from individual genomes and the Earth Microbiome Project, we demonstrated its validity and effectiveness in drawing ecological insights by applying it to six empirical prokaryotic community datasets from soil, aquatic, and human gut samples

  • We developed a database named ProkAtlas that links 16S rRNA gene sequences mined from metagenomes to prokaryotic habitats by substantially extending MetaMetaDB, which was previously developed by our group (Haider et al, 2018; Yang and Iwasaki, 2014)

Introduction

In the era of high-throughput sequencing, huge numbers of prokaryotic community structure datasets are being produced by 16S rRNA gene amplicon and shotgun metagenomic sequencing methods (Ramirez et al, 2018; Thompson et al, 2017). Genome size (Barberan et al, 2014), rRNA gene copy number (Nemergut et al, 2016), growth rate, stress tolerance, capability to acquire carbon sources (Malik et al, 2020), metabolic potential (Louca et al, 2016), and pigmentation (Choudoir et al, 2018) data have been adopted for trait-based analyses of prokaryotic community structure datasets. Although these approaches performed well, their trait data were limited and biased toward cultured prokaryotes with available genomic and physiological data. Prokaryotic communities, however, consist of approximately 80% to more than 90% uncultured members (Schloss et al, 2016; Steen et al, 2019).
