We investigated the global distribution patterns and pangenomic diversity of the candidate phylum "Latescibacteria" (WS3) in 16S rRNA gene and metagenomic data sets. We document distinct distribution patterns for various "Latescibacteria" orders in 16S rRNA gene data sets, with order sediment_1 prevalent in terrestrial habitats, PBSIII_9 in groundwater and temperate freshwater habitats, and GN03 in pelagic marine, saline-hypersaline, and wastewater habitats. Using a fragment recruitment approach, we identified 68.9 Mb of "Latescibacteria"-affiliated contigs, encoding 73,079 proteins, in publicly available metagenomic data sets. Metabolic reconstruction suggests a prevalent saprophytic lifestyle in all "Latescibacteria" orders, with marked capacities for the degradation of proteins, lipids, and polysaccharides predominant in plant, bacterial, fungal/crustacean, and eukaryotic algal cell walls. Extensive transport and central metabolic pathways for the metabolism of imported monomers were also identified. Interestingly, genes and domains suggestive of the production of a cellulosome (e.g., protein-coding genes harboring dockerin I domains attached to a glycosyl hydrolase, and scaffoldin-encoding genes harboring cohesin I and CBM37 domains) were identified in order PBSIII_9, GN03, and MSB-4E2 fragments recovered from four anoxic aquatic habitats, hence extending the cellulosomal production capabilities in Bacteria beyond the Gram-positive Firmicutes. In addition to fermentative pathways, a complete electron transport chain with the terminal cytochrome c oxidases Caa3 (for operation under high oxygen tension) and Cbb3 (for operation under low oxygen tension) was identified in PBSIII_9 and GN03 fragments recovered from oxygenated and partially/seasonally oxygenated aquatic habitats.
Our metagenomic recruitment effort hence represents a comprehensive pangenomic view of this yet-uncultured phylum and provides insights broader than and complementary to those gained from genome recovery initiatives focusing on a single or few sampled environments.

IMPORTANCE Our understanding of the phylogenetic diversity, metabolic capabilities, and ecological roles of yet-uncultured microorganisms is rapidly expanding. However, recent efforts have mainly focused on recovering genomes of novel microbial lineages from a specific sampling site, rather than from a wide range of environmental habitats. To comprehensively evaluate the genomic landscape, putative metabolic capabilities, and ecological roles of yet-uncultured candidate phyla, efforts that focus on the recovery of genomic fragments from a wide range of habitats and that adequately sample the intraphylum diversity within a specific target lineage are needed. Here, we investigated the global distribution patterns and pangenomic diversity of the candidate phylum "Latescibacteria." Our results document the preference of specific "Latescibacteria" orders for specific habitats, the prevalence of plant polysaccharide degradation abilities within all "Latescibacteria" orders, the occurrence of all genes/domains necessary for the production of cellulosomes within three "Latescibacteria" orders (GN03, PBSIII_9, and MSB-4E2) in data sets recovered from anaerobic locations, and the identification of the components of an aerobic respiratory chain, as well as the occurrence of multiple O2-dependent metabolic reactions, in "Latescibacteria" orders GN03 and PBSIII_9 recovered from oxygenated habitats.
The results demonstrate the value of phylocentric pangenomic surveys for understanding the global ecological distribution and panmetabolic abilities of yet-uncultured microbial lineages, since they provide insights broader than and complementary to those gained from single-cell genomic and/or metagenomic-enabled genome recovery efforts focusing on a single sampling site.