Labelling strategies in metabarcoding studies & how to ensure that nucleotide tags stay in place

Metabarcoding of environmental DNA (eDNA) and DNA extracted from bulk specimen samples is a powerful tool in studies of ecological interactions, diet and biodiversity, as its labelling of amplicons allows high-throughput sequencing of taxonomically informative DNA sequences from many samples in parallel. The backbone of metabarcoding is the addition of sample-specific nucleotide identifiers to amplicons; after sequencing, these identifiers are used to assign the metabarcoding sequences to the samples they originated from. This allows hundreds to thousands of samples to be pooled before sequencing, so that the full capacity of high-throughput sequencing platforms is used. The nucleotide identifiers can be added during the metabarcoding PCR, during library preparation (i.e. when amplicons are prepared for sequencing), or both.

There are three main strategies with which to achieve nucleotide labelling in metabarcoding studies. One commonly used strategy is the so-called tagged PCR approach, in which DNA extracts are individually amplified with metabarcoding primers that carry sample-specific nucleotide tags at the 5′ end. The uniquely tagged products are then pooled, and a library is prepared on the pool of amplicons. However, tag-jumps have been documented in this approach (Schnell et al. 2015). Tag-jumps cause nucleotide tags to switch between amplicons, resulting in amplicons that carry different tags than those originally applied. Sequences in the sequencing output that carry tag combinations not used in the study design are easily identified and excluded. However, sequences carrying incorrect tag combinations that are also used in the study design will be assigned to the wrong samples. Much to the detriment of metabarcoding studies, this can lead to false positives and artificial inflation of diversity in the samples (Schnell et al. 2015).

The occurrence of tag-jumps has led to recommendations to carry out metabarcoding PCR amplifications only with primers carrying twin-tags, to ensure that tag-jumps cannot result in false assignments of sequences to samples (Schnell et al. 2015). However, this increases both the cost and the workload of metabarcoding studies.

In a recently published article, we demonstrate the Tagsteady protocol, a tag-jump free single-tube library preparation protocol for Illumina sequencing specifically designed for 5′ nucleotide-tagged amplicons (Carøe & Bohmann 2020). We designed the Tagsteady protocol to circumvent the two steps in library preparation of pools of 5′ nucleotide-tagged amplicons that had previously been suggested to cause tag-jumps: i) T4 DNA polymerase blunt-ending in the end-repair step, and ii) post-ligation PCR amplification of amplicon libraries. We used pools of twin-tagged amplicons to investigate the effect of these two steps on the occurrence of tag-jumps. Doing this, we demonstrated that blunt-ending and post-ligation PCR, alone or together, can result in high proportions of tag-jumps, in our study up to ca. 49% of total sequences. The Tagsteady protocol, in which both of these steps are left out, resulted in tag-jump levels comparable to background contamination (Carøe & Bohmann 2020).

In our study, we encourage practitioners to avoid using T4 DNA polymerase blunt-ending and post-ligation PCR in library preparation of 5′ nucleotide-tagged amplicon pools, for example by using the Tagsteady protocol (Carøe & Bohmann 2020). This will enable efficient and cost-effective generation of metabarcoding data with correct assignment of sequences to samples.

References

Carøe C, Bohmann K (2020) Tagsteady: A metabarcoding library preparation protocol to avoid false assignment of sequences to samples. Molecular Ecology Resources, 20, 1620–1631.
Schnell IB, Bohmann K, Gilbert MTP (2015) Tag jumps illuminated - reducing sequence-to-sample misidentifications in metabarcoding studies. Molecular Ecology Resources, 15, 1289–1303.
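
Illustrative sketch. The tag-based assignment of sequences to samples described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the tag sequences, sample names, read tuples and the function `assign_reads` are all hypothetical. It shows why a tag-jumped read is discarded under a twin-tag design but silently misassigned when the resulting tag combination is also used in the study design.

```python
from collections import Counter

# Hypothetical twin-tag design: each sample carries the same tag at both ends.
TWIN_TAG_DESIGN = {
    ("ACGT", "ACGT"): "sample_1",
    ("TGCA", "TGCA"): "sample_2",
    ("GATC", "GATC"): "sample_3",
}

# Hypothetical design that additionally uses a mixed tag combination.
COMBINATORIAL_DESIGN = dict(TWIN_TAG_DESIGN)
COMBINATORIAL_DESIGN[("ACGT", "TGCA")] = "sample_4"


def assign_reads(reads, design):
    """Assign (forward_tag, reverse_tag, sequence) reads to samples.

    Returns (assigned, n_unused): `assigned` maps each sample name to a
    Counter of its sequences, and `n_unused` counts reads whose tag
    combination does not occur in the study design.
    """
    assigned = {}
    n_unused = 0
    for forward_tag, reverse_tag, sequence in reads:
        sample = design.get((forward_tag, reverse_tag))
        if sample is None:
            # Tag combination not used in the design: exclude the read.
            n_unused += 1
            continue
        assigned.setdefault(sample, Counter())[sequence] += 1
    return assigned, n_unused


# Toy reads; the fourth read imitates a tag-jump (sample_1 amplicon whose
# reverse tag has been replaced by the tag of sample_2).
reads = [
    ("ACGT", "ACGT", "seqA"),
    ("ACGT", "ACGT", "seqA"),
    ("TGCA", "TGCA", "seqB"),
    ("ACGT", "TGCA", "seqA"),
    ("GATC", "GATC", "seqC"),
]

twin_assigned, twin_unused = assign_reads(reads, TWIN_TAG_DESIGN)
comb_assigned, comb_unused = assign_reads(reads, COMBINATORIAL_DESIGN)
```

With the twin-tag design the jumped read falls into the unused count and is excluded; with the second design the same read is credited to `sample_4`, which is the kind of false positive the abstract describes.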