Abstract

DNA barcoding through the use of amplified regions of the ribosomal operon, such as the 16S gene, is a routine method to gain an overview of the microbial taxonomic diversity within a sample without the need to isolate and culture the microbes present. However, bacterial cells usually have multiple copies of this ribosomal operon, and choosing the ‘wrong’ copy could provide a misleading species classification. While this presents less of a problem for well-characterized organisms with large sequence databases to interrogate, it is a significant challenge for lesser-known organisms with unknown copy number and diversity. Using the entire length of the ribosomal operon, which encompasses the 16S, 23S, 5S and internal transcribed spacer regions, should provide greater taxonomic resolution but has not been well explored. Here, we use publicly available reference genomes to explore the theoretical boundaries of using concatenated genes and full-length ribosomal operons, an approach made possible by the development and uptake of long-read sequencing technologies. We quantify the issues of both copy choice and operon length in a phylogenetic context to demonstrate that longer regions improve the phylogenetic signal while maintaining taxonomic accuracy.

Highlights

  • Microbes are the most numerous organisms on the planet, and some are of great importance to our health and well-being, yet we do not understand the full diversity of microbes present [1]

  • Short-read shotgun sequencing of amplified regions of the 16S and 23S ribosomal RNA genes and the internal transcribed spacer (ITS) region has become a cheap, routine and direct way to gain a high-level understanding of microbial taxonomic diversity within complex samples, such as feces or soil, without requiring culturing of microbes [4,5]

  • By analyzing all paralogous copies at once, we found that even when the full ribosomal RNA operon was available without chimerism or other assembly artifacts, the choice of which genomic copy to analyze affected the phylogenetic inferences

Summary

Introduction

Microbes are the most numerous organisms on the planet, and some are of great importance to our health and well-being, yet we do not understand the full diversity of microbes present [1]. This is compounded by the fact that only a small number of bacterial species can currently be cultured. Evaluating the short hypervariable regions of the 16S gene gives family-level taxonomic resolution, distinguishing between the common Staphylococcal and Streptococcal pathogens [6]. This approach has limitations, as it is expected to detect previously characterized microbes as well as those about which we know very little, using genome regions that are theorized to exist in all microbes. Linking the phylogenetic signal to these short amplified markers can be challenging, so it is important to understand the limitations and potential of current technologies.
