Abstract

Assessment of biodiversity using metabarcoding data, such as from bulk or environmental DNA sampling, is becoming increasingly relevant in ecology, biodiversity sciences and monitoring. In such assessments, the taxonomic identification of species from their DNA sequences relies strongly on reference databases that link genetic sequences to taxonomic names. These databases vary in completeness and availability, depending on the taxonomic group studied and the genetic region targeted. The incompleteness of reference databases is an important argument for explaining why metabarcoding fails to detect species that are supposedly present. However, there are further, generally overlooked problems with reference databases that can lead to false or inaccurate taxonomic assignments. Here, we synthesize all possible problems inherent to reference databases. In particular, we identify a complete, mutually nonexclusive list of seven classes of challenges in selecting, developing and using a reference database for taxonomic assignment. These are: (i) mislabelling, (ii) sequencing errors, (iii) sequence conflict, (iv) taxonomic conflict, (v) low taxonomic resolution, (vi) missing taxa and (vii) missing intraspecific variants. For each problem identified, we describe its possible consequences for the taxonomic assignment process and illustrate it with examples taken from the literature or obtained by quantitative analyses of public databases, such as GenBank or BOLD. Finally, we discuss possible solutions to the identified problems and how to navigate them. Only by raising users' awareness of the limitations of metabarcoding data and DNA reference databases will adequate interpretations of these data be achieved.
