Abstract
The mitochondrial cytochrome c oxidase subunit I gene (COI) is commonly used in environmental DNA (eDNA) metabarcoding studies, especially for assessing metazoan diversity. Yet a large number of COI operational taxonomic units (OTUs) and/or amplicon sequence variants (ASVs) retrieved from such studies cannot be taxonomically assigned against a reference sequence. To investigate such sequences, we have developed the Dark mAtteR iNvestigator (DARN) software tool. For this purpose, a COI-oriented reference phylogenetic tree was built from 1,593 consensus sequences covering all three domains of life. With respect to eukaryotes, consensus sequences at the family level were constructed from 183,330 sequences retrieved from the Midori reference 2 database, which represented 70% of the initial number of reference sequences. Similarly, sequences from 431 bacterial and 15 archaeal taxa at the family level (29% and 1% of the initial number of reference sequences, respectively) were retrieved from the BOLD and the PFam databases. DARN uses this phylogenetic tree to investigate pre-processed COI sequences from amplicon samples and provides both a tabular and a graphical overview of their phylogenetic assignments. To evaluate DARN, both environmental and bulk metabarcoding samples from different aquatic environments, amplified with various primer sets, were analysed. We demonstrate that a large proportion of non-target prokaryotic organisms, such as bacteria and archaea, are also amplified in eDNA samples, and we suggest that prokaryotic COI sequences be included in the reference databases used for taxonomy assignment to allow for further analyses of dark matter. DARN source code is available on GitHub at https://github.com/hariszaf/darn and as a Docker image at https://hub.docker.com/r/hariszaf/darn.
Highlights
DNA metabarcoding is a rapidly evolving method that is increasingly employed in a range of fields, such as biodiversity, biomonitoring and molecular ecology (Deiner et al 2017; Ruppert et al 2019)
Building a cytochrome c oxidase subunit I (COI)-oriented reference phylogenetic tree is a challenging task, especially considering the small number of curated microbial COI sequences deposited in reference databases; e.g. ~4,000 bacterial and ~150 archaeal sequences in BOLD
To provide a more interactive way of communicating both our approach and our results, we strongly encourage the reader to visit the accompanying Google Colab notebook, where the building of the reference COI phylogenetic tree is described step-by-step, and the accompanying GitHub Pages site, where our results are presented
Summary
Metabarcoding: concept and caveats
DNA metabarcoding is a rapidly evolving method that is increasingly employed in a range of fields, such as biodiversity, biomonitoring and molecular ecology (Deiner et al 2017; Ruppert et al 2019). Environmental DNA (eDNA) metabarcoding, which targets DNA isolated directly from environmental samples (e.g., water, soil or sediment; Taberlet et al 2012a), is considered a holistic approach to biodiversity assessment (Stat et al 2017), providing high detection capacity. Mitochondria are nearly universally present in eukaryotic organisms, especially in metazoa, and their DNA can be sequenced and used to identify the species composition of a sample (Taberlet et al 2012b). It is essential that comprehensive public databases containing well-curated, up-to-date sequences from voucher specimens are available (Schenekar et al 2020). This way, sequences generated with universal primers can be compared with those in reference databases to assess the OTU composition of a sample. The taxonomy assignment step of eDNA metabarcoding, like identification via DNA barcoding, is only as good and accurate as the reference databases it relies on (Cilleros et al 2019).
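The dependence of taxonomy assignment on reference coverage can be illustrated with a toy sketch. This is not how DARN works (per the abstract, DARN relies on a reference phylogenetic tree), and it is not the method of any particular assignment tool: it is a naive best-hit comparison by percent identity, assuming equal-length sequences. All names (`identity`, `assign`, `Family_A`, `Bacterial_family_X`), the 10-bp sequences and the 0.9 threshold are hypothetical and chosen purely for illustration.

```python
from typing import Dict, Optional


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy example expects equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def assign(query: str, reference: Dict[str, str],
           min_identity: float = 0.9) -> Optional[str]:
    """Return the best-matching reference taxon, or None if below threshold."""
    best_taxon, best_score = None, 0.0
    for taxon, ref_seq in reference.items():
        score = identity(query, ref_seq)
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon if best_score >= min_identity else None


# Hypothetical, made-up 10-bp "barcodes"; real COI amplicons are far longer.
metazoa_only = {"Family_A": "ACGTACGTAC", "Family_B": "TTGACCGTAA"}
with_prokaryotes = {**metazoa_only, "Bacterial_family_X": "GGCCTTAAGG"}

otus = {
    "otu1": "ACGTACGTAC",  # identical to Family_A
    "otu2": "TTGACCGTAT",  # one mismatch from Family_B
    "otu3": "GGCCTTAAGG",  # identical to Bacterial_family_X
}

# Same OTUs, two reference sets: only the broader one can place otu3.
assigned_narrow = {name: assign(seq, metazoa_only) for name, seq in otus.items()}
assigned_broad = {name: assign(seq, with_prokaryotes) for name, seq in otus.items()}

for name in otus:
    print(name, assigned_narrow[name], assigned_broad[name])
```

In this sketch `otu3` stays unassigned against the metazoa-only reference and receives an assignment only once a matching non-metazoan entry is present, which is the sense in which an assignment can be no better than the reference database behind it.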