Abstract

DNA barcoding and metabarcoding are techniques that focus on signature genomic regions that in theory provide species-level resolution, although in practice this is not always possible. We place animal-focused COI metabarcoding in context with respect to the use of marker gene sequencing in microbial and fungal ecology. We focus on three specific aspects of metabarcodes: 1) the process of metabarcode sequence clustering, 2) how metabarcode cluster types affect the results of biodiversity analyses, and 3) the current state of reference sequence databases used for metabarcode identification. Using examples from the arthropod COI metabarcode literature, we show that exact sequence variants (ESVs) detect more unique taxa than operational taxonomic units (OTUs) but with similar patterns in taxonomic resolution. We also show that ordinations based on ESVs or OTUs recover similar groupings. We compile a list of reference sequence databases useful for multi-marker metabarcoding and present a list of reference sequence databases specifically formatted for use with a naive Bayesian classifier for rigorous metabarcode taxonomic assignments. Sophisticated tools and reference databases are available for analyzing COI sequences, and these compare favorably with those available for other metabarcode markers such as the ribosomal RNA genes used to target microbes and fungi.

Highlights

  • The objective of DNA barcoding is to permit specimen identification to the species rank

  • We have shown that alpha diversity (richness) is sensitive both to the choice of metabarcode cluster type and to primer choice, but what does this mean for beta diversity? For arthropods sampled using cytochrome c oxidase subunit I (COI) metabarcoding from freshwater or soil samples, beta diversity assessments have been shown to be robust to variations in both primer choice and sampling method (Hajibabaei et al., 2019; Porter et al., 2019)

  • Does this hold true for differences in clustering strategy and resolution of the matrix? In our research we have found that beta diversity estimates are robust to the use of either exact sequence variants (ESVs) or operational taxonomic units (OTUs) (Figure 4)


Summary

BACKGROUND

The objective of DNA barcoding is to permit specimen identification to the species rank.

To create an OTU × sample table containing read numbers, primer-trimmed paired sequences can be aligned to each OTU centroid sequence in the database. This step may require several parameters to be chosen, such as the identity threshold: a value of 0.97, for example, retains sequences with at least 97% sequence similarity to an OTU centroid sequence. To create an ESV × sample table containing read numbers, primer-trimmed paired sequences can be aligned to each unique ESV sequence in the database. Here the identity threshold is set to 1.0 so that only sequences with 100% sequence similarity to a denoised ESV sequence are retained.

We reanalyzed the data from a study that used COI metabarcoding to assess invertebrates directly from forest soils and compared the data reanalyzed two ways.

FIGURE 1 | ESVs detect more unique taxa than OTUs, but both reveal similar patterns in taxonomic resolution.
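The table-building step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. It assumes that reads and reference sequences have equal length and are already aligned, so that identity reduces to the fraction of matching positions; real tools compute identity from a pairwise alignment. The function names `identity` and `build_table`, the sample names, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def identity(a: str, b: str) -> float:
    """Fraction of matching positions.

    Simplifying assumption: both sequences are the same length and
    already aligned. Sequences of unequal length score 0.0.
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def build_table(reads_by_sample, references, threshold):
    """Build a cluster x sample table of read counts.

    Each read is assigned to its best-matching reference sequence
    (OTU centroid or ESV) and counted only if the identity meets
    the threshold; otherwise the read is discarded.
    """
    table = {ref_id: defaultdict(int) for ref_id in references}
    for sample, reads in reads_by_sample.items():
        for read in reads:
            best_id, best_score = None, 0.0
            for ref_id, ref_seq in references.items():
                score = identity(read, ref_seq)
                if score > best_score:
                    best_id, best_score = ref_id, score
            if best_id is not None and best_score >= threshold:
                table[best_id][sample] += 1
    return {ref_id: dict(counts) for ref_id, counts in table.items()}


# Toy data (hypothetical): two 100-base reference sequences.
ref_a = "ACGT" * 25
ref_b = "TTGCA" * 20
references = {"A": ref_a, "B": ref_b}

read_98 = "CA" + ref_a[2:]      # 2 mismatches to ref_a: 98% identity
read_95 = "CATGC" + ref_a[5:]   # 5 mismatches to ref_a: 95% identity

reads_by_sample = {
    "soil_1": [ref_a, read_98, read_95],
    "soil_2": [ref_b, ref_a],
}

# OTU-style mapping: keep reads within 97% of a centroid.
otu_table = build_table(reads_by_sample, references, threshold=0.97)

# ESV-style mapping: keep only exact matches.
esv_table = build_table(reads_by_sample, references, threshold=1.0)

print("OTU table:", otu_table)
print("ESV table:", esv_table)
```

In this toy case the 98% read is counted under the 0.97 threshold but dropped under the 1.0 threshold, and the 95% read is dropped under both, which is the only difference between the two mapping modes.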

