Piphillin predicts metagenomic composition and dynamics from DADA2-corrected 16S rDNA sequences

Nicole R Narayan, Thomas Weinmaier, Emilio J Laserna-Mendieta, Marcus J Claesson, Fergus Shanahan, Karim Dabbagh, Shoko Iwai, Todd Z Desantis

doi:10.1186/s12864-019-6427-1

Abstract

Background: Shotgun metagenomic sequencing reveals the potential in microbial communities. However, lower-cost 16S ribosomal RNA (rRNA) gene sequencing provides taxonomic, not functional, observations. To remedy this, we previously introduced Piphillin, a software package that predicts functional metagenomic content based on the frequency of detected 16S rRNA gene sequences corresponding to genomes in regularly updated, functionally annotated genome databases. Piphillin and similar tools have previously been evaluated on 16S rRNA data processed by clustering sequences into operational taxonomic units (OTUs). New techniques such as amplicon sequence variant error correction are increasingly used, but it is unknown whether these techniques perform better in metagenomic content prediction pipelines, or whether they should be treated the same as OTU data with respect to optimal pipeline parameters.

Results: To evaluate the effect of the 16S rRNA sequence analysis method (clustering sequences into OTUs versus error correction into amplicon sequence variants (ASVs)) on the ability of Piphillin to predict functional metagenomic content, we compared Piphillin-predicted functional content from 16S rRNA sequence data processed by each method against corresponding shotgun metagenomic data. We show a strong correlation between metagenomic data and Piphillin-predicted functional content resulting from both 16S rRNA sequence analysis methods. Differential abundance testing with Piphillin-predicted functional content exhibited a low false positive rate (< 0.05) while capturing a large fraction of the differentially abundant features resulting from corresponding metagenomic data. However, Piphillin prediction performance was optimal at different cutoff parameters depending on the 16S rRNA sequence analysis method. Using data analyzed with amplicon sequence variant error correction, Piphillin outperformed comparable tools, for instance exhibiting 19% greater balanced accuracy and 54% greater precision compared to PICRUSt2.

Conclusions: Our results demonstrate that raw Illumina sequences should be processed for subsequent Piphillin analysis using amplicon sequence variant error correction (with DADA2 or similar methods) and run using a 99% ID cutoff for Piphillin, while sequences generated on platforms other than Illumina should be processed via OTU clustering (e.g., UPARSE) and run using a 96% ID cutoff for Piphillin. Piphillin is publicly available for academic users (Piphillin server: http://piphillin.secondgenome.com/).
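
The abstract reports false positive rate, balanced accuracy, and precision for differential abundance testing, with the shotgun metagenomic result treated as the reference. The minimal Python sketch below shows how these three metrics are conventionally computed from two sets of differentially abundant features. The function names and the feature identifiers are hypothetical and chosen only for illustration; this is not the evaluation code used in the paper.

```python
def confusion_counts(predicted, reference, all_features):
    """Count TP, FP, FN, TN, treating `reference` (e.g. features called
    differentially abundant in shotgun data) as the ground truth."""
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)
    fp = len(predicted - reference)
    fn = len(reference - predicted)
    tn = len(set(all_features) - predicted - reference)
    return tp, fp, fn, tn


def precision(tp, fp):
    """Fraction of predicted features that are also in the reference."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def false_positive_rate(fp, tn):
    """Fraction of reference-negative features that were wrongly called."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def balanced_accuracy(tp, fp, fn, tn):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Hypothetical toy data: ten features, four truly differentially abundant.
all_features = [f"K{i:05d}" for i in range(1, 11)]
reference = {"K00001", "K00002", "K00003", "K00004"}
predicted = {"K00001", "K00002", "K00003", "K00005"}

tp, fp, fn, tn = confusion_counts(predicted, reference, all_features)
print(precision(tp, fp), false_positive_rate(fp, tn),
      balanced_accuracy(tp, fp, fn, tn))
```

Balanced accuracy is useful here because differentially abundant features are typically a minority of all features, so plain accuracy would be dominated by true negatives.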

Highlights

Shotgun metagenomic sequencing reveals the potential in microbial communities
16S ribosomal RNA (rRNA) sequence analysis approach impacts the quantity of sequences kept for processing, correlation to metagenomic data, and detection of differentially abundant features
Traditionally, 16S rRNA gene sequence data has been analyzed via either clustering sequences to an external reference, clustering sequences to an external reference followed by de novo operational taxonomic unit (OTU) clustering on remaining reads, or de novo OTU clustering on all reads
We studied the impact of 16S rRNA gene sequence analysis method (ASV error correction with DADA2 (ASVs) versus 97% de novo OTU clustering using UPARSE (OTUs)) on Piphillin results at varying identity cutoffs

Summary

Introduction

Shotgun metagenomic sequencing reveals the potential in microbial communities. However, lower-cost 16S ribosomal RNA (rRNA) gene sequencing provides taxonomic, not functional, observations. Piphillin and similar tools have previously been evaluated on 16S rRNA data processed by clustering sequences into operational taxonomic units (OTUs). New techniques such as amplicon sequence variant error correction are increasingly used, but it is unknown whether these techniques perform better in metagenomic content prediction pipelines, or whether they should be treated the same as OTU data with respect to optimal pipeline parameters. Because Piphillin relies on nearest-neighbor matching of 16S rRNA gene sequences to genomic sequence data held in its reference genome databases, the significant expansion of these collections increases the likelihood of finding matched candidates. These expansions enhance the integrity and accuracy of predicted genome contents. Considering these significant changes to reference sequence databases, it is necessary to re-assess Piphillin using the same metrics and criteria described in the original paper.
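
To make the role of the identity cutoff concrete, the following Python sketch illustrates the general idea of nearest-neighbor functional prediction: each 16S sequence is assigned to its best-matching reference genome, matches below the cutoff are discarded, counts are normalized by 16S copy number, and the genome's gene content is summed. This is a simplified illustration under stated assumptions, not Piphillin's implementation; the function `predict_gene_abundance`, its inputs, and all sample values are hypothetical.

```python
from collections import defaultdict


def predict_gene_abundance(counts, best_hits, genome_16s_copies,
                           genome_gene_copies, id_cutoff=99.0):
    """Sketch of nearest-neighbor functional prediction (hypothetical API).

    counts:             {sequence_id: read count in one sample}
    best_hits:          {sequence_id: (genome_id, percent identity)}
    genome_16s_copies:  {genome_id: number of 16S gene copies}
    genome_gene_copies: {genome_id: {gene_id: copy number}}
    id_cutoff:          minimum percent identity for a match to be used
    """
    predicted = defaultdict(float)
    for seq_id, count in counts.items():
        hit = best_hits.get(seq_id)
        if hit is None:
            continue  # no reference genome matched this sequence
        genome, identity = hit
        if identity < id_cutoff:
            continue  # nearest neighbor is too distant to trust
        # Convert 16S read counts to estimated genome copies.
        genome_copies = count / genome_16s_copies[genome]
        for gene, copies in genome_gene_copies[genome].items():
            predicted[gene] += genome_copies * copies
    return dict(predicted)


# Hypothetical toy inputs.
counts = {"asv1": 100, "asv2": 40, "asv3": 10}
best_hits = {"asv1": ("genomeA", 100.0),
             "asv2": ("genomeB", 97.0),
             "asv3": ("genomeB", 99.5)}
genome_16s_copies = {"genomeA": 4, "genomeB": 2}
genome_gene_copies = {"genomeA": {"K00001": 1, "K00002": 2},
                      "genomeB": {"K00002": 1}}

strict = predict_gene_abundance(counts, best_hits, genome_16s_copies,
                                genome_gene_copies, id_cutoff=99.0)
relaxed = predict_gene_abundance(counts, best_hits, genome_16s_copies,
                                 genome_gene_copies, id_cutoff=96.0)
print(strict)   # asv2 is excluded at the 99% cutoff
print(relaxed)  # asv2 contributes at the 96% cutoff
```

In this toy case the stricter cutoff drops one sequence whose nearest genome is only 97% identical, which shows why the choice of cutoff can change predicted functional abundances and why an expanded reference database increases the chance of a usable match.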

Journal: BMC Genomics	Publication Date: Jan 17, 2020
Citations: 57	License type: open-access
