Abstract

Next-generation sequencing has provided powerful tools to conduct microbial ecology studies. Analysis of community composition relies on annotated databases of curated sequences to provide taxonomic assignments; however, these databases occasionally contain errors that have implications for downstream analyses. Systemic taxonomic errors were discovered in the Greengenes database (v13_5 and 13_8) related to the orders Vibrionales and Alteromonadales. These orders have family-level annotations that were erroneous at one or more taxonomic levels, e.g., 100% of sequences assigned to the Pseudoalteromonadaceae family were placed improperly in Vibrionales (rather than Alteromonadales), and >20% of these sequences were indeed Vibrio spp. but were improperly assigned to the Pseudoalteromonadaceae family (rather than to Vibrionaceae). Use of this database is common; we identified 68 peer-reviewed papers since 2013 that likely included erroneous annotations specifically associated with Vibrionales and Pseudoalteromonadaceae, with 20 explicitly stating the incorrect taxonomy. Erroneous assignments using these specific versions of Greengenes can lead to incorrect conclusions, especially in marine systems where these taxa are commonly encountered as conditionally rare organisms and potential pathogens.

Highlights

  • Analysis of 16S rRNA gene sequences has dramatically changed the way microbiologists understand the ecology of whole bacterial communities in an ecosystem

  • Because the analyses reported by Edgar (2018b), among others (Beiko, 2016; Kozlov et al, 2016; Balvociute & Huson, 2017), suggest broader issues with taxonomic mismatches, we assessed the extent of the misclassification associated with the order Vibrionales and the family Pseudoalteromonadaceae, given the importance of these taxa in marine systems

  • The Ribosomal Database Project (RDP) and NCBI identified 45, and SILVA 43, of the 164 sequences as Vibrio spp. that were incorrectly assigned to the Pseudoalteromonadaceae family in Greengenes; because Greengenes assigned all members of the Pseudoalteromonadaceae family to the Vibrionales order, these 43–46 sequences were incorrect only at the family level (Fig. 1)
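As a quick arithmetic check on the counts in the last highlight, the sketch below converts them to percentages of the 164 sequences. It assumes the bullet means 45 sequences for RDP and 45 for NCBI (the bullet is ambiguous on this point, and the 43–46 range suggests one count may differ slightly); the dictionary and function names are illustrative only.

```python
# Counts of the 164 Greengenes Pseudoalteromonadaceae sequences that other
# databases identified as Vibrio spp., as read from the highlight above.
# ASSUMPTION: "RDP and NCBI identified 45" is taken to mean 45 for each.
TOTAL_SEQUENCES = 164
VIBRIO_CALLS = {"RDP": 45, "NCBI": 45, "SILVA": 43}


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Percentage of the 164 sequences called Vibrio spp. by each database,
# rounded to one decimal place.
shares = {db: round(percent(n, TOTAL_SEQUENCES), 1) for db, n in VIBRIO_CALLS.items()}

for db, share in shares.items():
    print(f"{db}: {share}% of {TOTAL_SEQUENCES} sequences")
```

Under that reading, every database places the share above one quarter, which is consistent with the ">20%" figure given in the abstract.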


Summary

Introduction

Analysis of 16S rRNA gene sequences has dramatically changed the way microbiologists understand the ecology of whole bacterial communities in an ecosystem. We are able to sequence millions of reads of this gene in mixed samples to understand changes and dynamics in microbial composition. To analyze these data, sequence reads are compared against a curated ribosomal sequence database with known taxonomic identities. Commonly used databases include Greengenes (DeSantis et al, 2006; McDonald et al, 2012; http://greengenes.secondgenome.com) and SILVA (Pruesse et al, 2007), among others. Although Greengenes is one of the smallest databases, it has been suggested as the preferred database for taxonomic classification because of its capacity to assign taxonomy to great depth (e.g., species-level identification) (Werner et al, 2012).

How to cite this article: Lydon and Lipp (2018), Taxonomic annotation errors incorrectly assign the family Pseudoalteromonadaceae to the order Vibrionales in Greengenes: implications for microbial community assessments.
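Because taxonomic assignments are inherited directly from the reference database, a lineage inconsistency of the kind described in the abstract can be screened for before downstream analysis. The following is a minimal sketch, not the authors' method: it assumes Greengenes-style taxonomy strings with rank prefixes (`o__` for order, `f__` for family), and the sequence IDs, the two example records, and the `EXPECTED_ORDER` lookup are hypothetical illustrations built from the family-to-order placements stated in the abstract.

```python
# Expected parent order for each family, per the abstract:
# Pseudoalteromonadaceae belongs in Alteromonadales, Vibrionaceae in Vibrionales.
EXPECTED_ORDER = {
    "Pseudoalteromonadaceae": "Alteromonadales",
    "Vibrionaceae": "Vibrionales",
}


def parse_lineage(tax_string):
    """Split a 'k__...; p__...; ...' string into a {rank_prefix: name} dict."""
    ranks = {}
    for field in tax_string.split(";"):
        field = field.strip()
        if "__" in field:
            prefix, name = field.split("__", 1)
            ranks[prefix] = name
    return ranks


def find_mismatches(records):
    """Return (id, family, annotated_order, expected_order) for inconsistent lineages."""
    flagged = []
    for seq_id, tax_string in records:
        ranks = parse_lineage(tax_string)
        family = ranks.get("f", "")
        order = ranks.get("o", "")
        expected = EXPECTED_ORDER.get(family)
        if expected is not None and order != expected:
            flagged.append((seq_id, family, order, expected))
    return flagged


# Hypothetical records for illustration only (IDs are made up).
records = [
    ("seq1", "k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; "
             "o__Vibrionales; f__Pseudoalteromonadaceae; g__Pseudoalteromonas; s__"),
    ("seq2", "k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; "
             "o__Vibrionales; f__Vibrionaceae; g__Vibrio; s__"),
]

mismatches = find_mismatches(records)
for seq_id, family, order, expected in mismatches:
    print(f"{seq_id}: {family} annotated under {order}, expected {expected}")
```

A check like this only catches family-to-order inconsistencies. It would not detect Vibrio sequences that were assigned to the wrong family, which requires comparing the sequences themselves against other databases.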

Methods
Results
Conclusion
