Abstract

16S rRNA gene analysis is the most convenient and robust method for microbiome studies. Inaccurate taxonomic assignment of bacterial strains can have deleterious effects, as all downstream analyses rely heavily on the accurate assessment of microbial taxonomy. The use of mock communities to check the reliability of the results has been suggested. However, the mock communities used in most studies represent only a small fraction of taxa and serve mostly to validate the sequencing run and estimate sequencing artifacts. Moreover, the large number of databases and tools available for classification and taxonomic assignment of the 16S rRNA gene makes it challenging to select the best-suited method for a particular dataset. In the present study, we used authentic and validly published 16S rRNA gene type strain sequences (full length and V3-V4 region) and analyzed them with the widely used QIIME pipeline, varying the OTU clustering parameters and the QIIME-compatible databases. The Data Analysis Measures (DAMs) revealed a high discrepancy in confirming the taxonomy at different taxonomic levels. Beta diversity analysis showed clear segregation of the different DAMs. Limited differences were observed between the reference data set analyses using partial (V3-V4) and full-length 16S rRNA gene sequences, which signifies the reliability of partial 16S rRNA gene sequences in microbiome studies. Our analysis also highlights common discrepancies observed at various taxonomic levels across methods and databases.
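The comparison described above, checking assigned taxonomy against the known taxonomy of type strains at each level, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the two example sequences, and the assumption that lineages are semicolon-delimited strings with Greengenes-style rank prefixes (`k__`, `p__`, ...) are all introduced here for illustration only.

```python
# Hypothetical sketch: fraction of sequences whose assigned label matches
# the expected (type strain) label at each taxonomic rank.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def parse_lineage(lineage):
    """Split a semicolon-delimited lineage into one label per rank.

    Rank prefixes such as 'g__' are stripped; missing ranks become ''.
    """
    parts = [p.strip() for p in lineage.split(";")]
    parts = [p.split("__", 1)[1] if "__" in p else p for p in parts]
    parts += [""] * (len(RANKS) - len(parts))
    return parts[:len(RANKS)]


def rank_agreement(expected, assigned):
    """Return {rank: fraction of expected sequences assigned correctly}."""
    counts = {rank: 0 for rank in RANKS}
    for seq_id, exp_lineage in expected.items():
        exp = parse_lineage(exp_lineage)
        got = parse_lineage(assigned.get(seq_id, ""))
        for i, rank in enumerate(RANKS):
            # An empty expected or assigned label never counts as a match.
            if exp[i] and exp[i] == got[i]:
                counts[rank] += 1
    n = len(expected)
    return {rank: (counts[rank] / n if n else 0.0) for rank in RANKS}


# Made-up example data (assumption, not taken from the study).
expected = {
    "seq1": "k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; "
            "f__Bacillaceae; g__Bacillus; s__subtilis",
    "seq2": "k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; "
            "o__Enterobacterales; f__Enterobacteriaceae; g__Escherichia; s__coli",
}
assigned = {
    # Species left unresolved by the classifier.
    "seq1": "k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; "
            "f__Bacillaceae; g__Bacillus; s__",
    # Order name differs between databases.
    "seq2": "k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; "
            "o__Enterobacteriales; f__Enterobacteriaceae; g__Escherichia; s__coli",
}

result = rank_agreement(expected, assigned)
for rank in RANKS:
    print(f"{rank}: {result[rank]:.2f}")
```

In this toy input, one sequence is unresolved at species level and the other carries a differently spelled order name, so agreement drops at exactly those two ranks. Exact string matching is a deliberate simplification; a real comparison would also need to handle synonyms and nomenclature differences between databases.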

Highlights

  • Next-Generation Sequencing (NGS) techniques are capable of generating high quality, comparable data [1]

  • Considering the parameters mentioned above, it was expected that the number of Operational Taxonomic Units (OTUs) in the Data Analysis Measures (DAM) would be similar to that of the sample data set

  • In OTU picking, the number of OTUs obtained in each variation is significantly different


Summary

Introduction

Next-Generation Sequencing (NGS) techniques are capable of generating high quality, comparable data [1]. Although 16S rRNA gene-based analysis remains the gold standard, proper precautions need to be taken during sequencing, preprocessing of data, and subsequent downstream analyses. The selection of the variable region, the choice of method for OTU clustering, the selection of reference databases, and the sequencing platform have been shown to play an important role in the assessment of microbial diversity [5,6]. It is known that sequencing errors on different sequencing platforms can reduce the reliability of the analysis [9]. Another limitation is that the taxonomic assignment is dependent on the reference database.

