Abstract

Culture-independent analysis of microbial communities frequently relies on amplification and sequencing of the prokaryotic 16S ribosomal RNA gene. Typical analysis pipelines group sequences into operational taxonomic units (OTUs) to infer taxonomic and phylogenetic relationships. Here, we present HmmUFOtu, a novel tool for processing microbiome amplicon sequencing data, which performs rapid per-read phylogenetic placement, followed by phylogenetically informed clustering into OTUs and taxonomy assignment. Compared to standard pipelines, HmmUFOtu more accurately and reliably recapitulates microbial community diversity and composition in simulated and real datasets without relying on heuristics or sacrificing speed or accuracy.

Highlights

  • Culture-independent amplification, sequencing, and analysis of phylogenetic marker genes, such as the prokaryotic 16S ribosomal RNA gene, enables community-wide analysis of the diversity and composition of host-associated and environmental microbiota

  • When comparing the operational taxonomic unit (OTU) tables generated by each analysis to the theoretical genus-level composition of the mock community, we found that HmmUFOtu closely recapitulated the composition of the reference community, especially for the V4 mock dataset (Fig. 5a, b)

  • By aligning the generated representative sequences to the known genomic sequences of the mock community bacteria (Additional file 1: Table S2) using the NCBI blastn program [9], we found that HmmUFOtu's consensus-based rep-seqs show higher sequence similarity to the reference genomes than those of the QIIME-default method, which uses the first read (“first”) in each OTU (Fig. 6a, Kruskal–Wallis test; p → 0 for both V4 and V1 V3 datasets). This suggests that the consensus of all observed sequences generally better represents the true bacterial target gene sequences because it aggregates information across multiple reads and/or samples
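
The contrast between a consensus representative sequence and a "first read" representative can be illustrated with a small Python sketch. This is not HmmUFOtu's implementation: it assumes the reads in an OTU are already aligned to equal length, uses a simple column-wise majority vote, and runs on made-up toy reads. The function names `consensus` and `identity` are hypothetical and chosen only for this example.

```python
from collections import Counter


def consensus(aligned_reads, gap="-"):
    """Column-wise majority vote over equal-length aligned reads.

    Illustrative only: a real tool may weight reads, use quality
    scores, or work on a profile alignment instead.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    bases = []
    for column in zip(*aligned_reads):
        # Most frequent character in this alignment column.
        base, _count = Counter(column).most_common(1)[0]
        if base != gap:  # drop columns where a gap wins the vote
            bases.append(base)
    return "".join(bases)


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy data (assumed, not from the paper): one true sequence and four
# reads, three of which carry a single sequencing error each.
reference = "ACGTAGCA"
reads = ["ACGTTGCA", "ACGTAGCA", "ACCTAGCA", "ACGTAGGA"]

first_rep = reads[0]             # "first read" style representative
consensus_rep = consensus(reads)  # consensus style representative

print(identity(first_rep, reference), identity(consensus_rep, reference))
```

In this toy case each error occurs in a different read, so the majority vote removes all of them, while the first read keeps its own error. That is the intuition behind aggregating information across reads; the actual size of the effect on real data is what Fig. 6a reports.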


Summary

Introduction

Culture-independent amplification, sequencing, and analysis of phylogenetic marker genes, such as the prokaryotic 16S ribosomal RNA (rRNA) gene, enables community-wide analysis of the diversity and composition of host-associated and environmental microbiota. These approaches rely heavily on computational methods to cluster amplicon sequences into groups of putatively conspecific sequences (operational taxonomic units [OTUs]) and to infer taxonomic and phylogenetic relationships [1]. Although these methods are widely used for profiling microbial communities, serious practical concerns arise when they are applied to growing collections of 16S rRNA amplicon sequences.
