Belowground invertebrate communities are dominated by species-rich and very small microarthropods that require long handling times and high taxonomic expertise for species determination. Molecular-based methods like metabarcoding circumvent the morphological determination process by assigning taxa bioinformatically based on sequence information. The potential to analyse diverse and cryptic communities in a short time at high taxonomic resolution is promising. However, metabarcoding studies have revealed that taxonomic assignment below family level in Collembola (Hexapoda) and Oribatida (Acariformes) is difficult and often fails. These are the most abundant and species-rich soil-living microarthropods, and the application of molecular-based, automated species determination would be most beneficial in these taxa. In this study, we analysed the presence of a barcoding gap in the standard barcoding gene cytochrome oxidase I (COI) in Collembola and Oribatida. The barcoding gap describes a significant difference between intra- and interspecific genetic distances among taxa and is essential for bioinformatic taxa assignment. We collected COI sequences of Collembola and Oribatida from BOLD and NCBI and focused on species with a wide geographic sampling to capture the range of their intraspecific variance. Our results show that intra- and interspecific genetic distances in COI overlapped in most species, impeding accurate assignment. When a barcoding gap was present, it exceeded the standard threshold of 3% intraspecific distance and differed between species. Automatic specimen assignments also showed that most species comprised multiple genetic lineages that caused ambiguous taxon assignments in distance-based methods. Character-based taxonomic assignment using phylogenetic trees and monophyletic clades as criteria worked for some species of Oribatida but failed completely for Collembola.
Notably, parthenogenetic species showed lower genetic variance in COI and more accurate species assignment than sexual species. The different patterns in genetic diversity among species suggest that the different degrees of genetic variance result from deep evolutionary distances. This indicates that a single genetic threshold, or a single standard gene, will probably not be sufficient for the molecular species identification of many Collembola and Oribatida taxa. Our results also show that haplotype diversity in some of the investigated taxa was far from fully covered, although coverage was better for Collembola than for Oribatida. Additional use of secondary barcoding genes and long-read sequencing of marker genes can improve metabarcoding studies. We also recommend the construction of pan-genomes and pan-barcodes of species lacking a barcoding gap. This will make it possible both to identify species boundaries and to cover the full range of variability in the marker genes, enabling molecular identification even for species with highly diverse barcode sequences.
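The barcoding gap criterion described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes pre-aligned sequences of equal length, uses uncorrected p-distances as the distance measure, and treats a gap as present when the smallest interspecific distance exceeds the largest intraspecific distance. The function names `p_distance` and `barcoding_gap`, the species labels, and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ('-') or ambiguous base ('N') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    sites = [(x, y) for x, y in zip(a, b) if x not in "-N" and y not in "-N"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def barcoding_gap(records):
    """Summarise intra- and interspecific distances per species.

    records: list of (species_label, aligned_sequence) tuples.
    Returns {species: (max_intra, min_inter, has_gap)}. Species with fewer
    than two sequences get max_intra=None and has_gap=None, because their
    intraspecific variation cannot be estimated.
    """
    summary = {}
    for sp in sorted({label for label, _ in records}):
        own = [seq for label, seq in records if label == sp]
        others = [seq for label, seq in records if label != sp]
        intra = [p_distance(a, b) for a, b in combinations(own, 2)]
        inter = [p_distance(a, b) for a in own for b in others]
        max_intra = max(intra) if intra else None
        min_inter = min(inter) if inter else None
        if max_intra is None or min_inter is None:
            has_gap = None
        else:
            has_gap = min_inter > max_intra
        summary[sp] = (max_intra, min_inter, has_gap)
    return summary


# Toy data (hypothetical labels and sequences, not real COI barcodes).
records = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "TTGTACGGAC"),
    ("sp_B", "TTGTACGGAC"),
    ("sp_C", "ACGTACGTTT"),
]
gaps = barcoding_gap(records)
```

In this toy data set, `sp_A` has no gap because one of its sequences is as close to `sp_C` as it is to its conspecific, which is the kind of overlap reported here for most species. `sp_B` is separated from all other sequences, and `sp_C` cannot be evaluated from a single sequence.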