Assessment of Common and Emerging Bioinformatics Pipelines for Targeted Metagenomics.

Léa Siegwald, David Hot, Ségolène Caboche, Hélène Touzet, Christophe Audebert, Yves Lemoine

doi:10.1371/journal.pone.0169563

Abstract

Targeted metagenomics, also known as metagenetics, is a high-throughput sequencing application focusing on a nucleotide target in a microbiome to describe its taxonomic content. A wide range of bioinformatics pipelines is available to analyze sequencing outputs, and choosing an appropriate tool is both crucial and non-trivial. No standard evaluation method exists for estimating the accuracy of a pipeline for targeted metagenomics analyses. This article proposes an evaluation protocol containing real and simulated targeted metagenomics datasets, together with adequate metrics that allow us to study the impact of different variables on the biological interpretation of results. This protocol was used to compare six bioinformatics pipelines in the basic user context: three common ones (mothur, QIIME and BMP) based on a clustering-first approach, and three emerging ones (Kraken, CLARK and One Codex) using an assignment-first approach. Surprisingly, this study reveals that sequencing errors have a bigger impact on the results than the choice of amplified region. Moreover, increasing sequencing throughput increases richness overestimation, even more so for microbiota of high complexity. Finally, the choice of the reference database has a bigger impact on richness estimation for clustering-first pipelines, and on correct taxa identification for assignment-first pipelines. Using emerging assignment-first pipelines is a valid approach for targeted metagenomics analyses, with a quality of results comparable to popular clustering-first pipelines, even with an error-prone sequencing technology like Ion Torrent. However, those pipelines are highly sensitive to the quality of databases and their annotations, which leaves clustering-first pipelines as the only reliable approach for studying microbiomes that are not well described.

Highlights

Metagenomics based on high-throughput sequencing (HTS) helps biologists unveil a large part of the constitutive microorganisms of a microbiota
Computational approaches to analyze targeted metagenomics data have been developed in parallel with the popularization of this new application
The first tools, such as DOTUR (Schloss, 2005), clustered sequences into Operational Taxonomic Units (OTUs) based on the genetic distances between sequences
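
To make the idea of distance-based OTU clustering concrete, here is a minimal Python sketch. It is not DOTUR's implementation: it assumes the reads are already aligned to the same length, uses a simple mismatch fraction as the genetic distance, and applies furthest-neighbor agglomerative clustering with a 3% distance cutoff. The function names (`p_distance`, `cluster_otus`), the cutoff value and the toy reads are all illustrative assumptions.

```python
from __future__ import annotations

from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def cluster_otus(seqs: dict[str, str], cutoff: float = 0.03) -> list[set[str]]:
    """Group sequences into OTUs by furthest-neighbor agglomerative clustering.

    Two clusters are merged only while the largest pairwise distance
    between their members stays at or below `cutoff` (an assumed value).
    """
    clusters = [{name} for name in seqs]
    # Precompute all pairwise distances once.
    dist = {
        frozenset((a, b)): p_distance(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)
    }

    def linkage(c1: set[str], c2: set[str]) -> float:
        # Furthest-neighbor linkage: the worst-case distance across clusters.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters under furthest-neighbor linkage.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > cutoff:
            break  # no remaining pair is similar enough to merge
        clusters[i] |= clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


if __name__ == "__main__":
    base = "ACGT" * 25  # hypothetical 100 bp aligned read
    reads = {
        "r1": base,
        "r2": "C" + base[1:],   # one substitution at the first position
        "r3": base[:-1] + "A",  # one substitution at the last position
        "r4": "TGCA" * 25,      # unrelated sequence
    }
    for otu in cluster_otus(reads):
        print(sorted(otu))
```

In this sketch the number of OTUs returned stands in for the richness estimate, so reads carrying enough sequencing errors to exceed the cutoff would form extra OTUs of their own.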

Summary

Introduction

Metagenomics based on high-throughput sequencing (HTS) helps biologists unveil a large part of the constitutive microorganisms of a microbiota. Shotgun metagenomics usually considers the entire genomic content of a sample, by extracting and sequencing the total DNA. As a result, this comprehensive approach offers a rich picture of a microbiota, and provides the opportunity to simultaneously explore the taxonomic and functional diversity of microbial communities [6]. However, shotgun metagenomics is still very expensive, and the data analysis is a challenging task due to both the size and the complex structure of the data [7]. This is a significant obstacle to common applications.

Journal: PLOS ONE
Publication Date: Jan 4, 2017
Citations: 61
License type: CC BY 4.0
