Abstract

Biodiversity databases are increasingly available and have accelerated advances in many disciplines within ecology and evolution. However, the quality of the evidence generated depends critically on the quality of the input data, and species misidentifications are present in virtually any occurrence dataset. The lack of automated tools makes assessing the quality of species identification in big datasets time-consuming, which often leads researchers to assume that all species are reliably identified. In this study, we address this issue by evaluating how species misidentification can impact our ability to capture ecological patterns, and by presenting an R package, called naturaList, designed to classify species occurrence data according to identification reliability. naturaList classifies species occurrences into up to six confidence levels, in which the highest level is assigned to records identified by specialists. We obtained the list of specialists in two ways: from the identifier names within the species occurrence dataset itself, and from an independent list obtained by contacting experts. Further, we evaluated the effects of filtering out occurrence records not identified by specialists on the estimates of species niche and diversity patterns. We used the tribe Myrteae (Myrtaceae) as a study model, a species-rich group in Central and South America with a challenging taxonomy. We found a significant change in species niche in 13% of species when using only occurrences identified by specialists. We found changes in patterns of alpha diversity in four genera and changes in beta diversity in all genera analyzed. We show that uncertainty in species identification in occurrence datasets affects conclusions on macroecological patterns by generating bias or noise in niche, alpha diversity, and beta diversity estimates. Therefore, to guarantee reliability in species identification in big datasets, we recommend the use of automated tools such as the naturaList package, especially when analyzing variation in species composition. This study also represents a step toward increasing the quality of large-scale studies that rely on species occurrence data.
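
The idea of deriving a specialist list from identifier names and then flagging records can be illustrated with a minimal sketch. This is not the naturaList API (the package is written in R). The records, the identifier names, and the functions `identifier_counts` and `classify` are hypothetical, and only two of the up to six confidence levels are shown.

```python
from collections import Counter

# Hypothetical occurrence records as (species, identified_by) pairs.
# The identifier names are invented and are not data from the study.
records = [
    ("Myrcia splendens", "A. Silva"),
    ("Myrcia splendens", "J. Doe"),
    ("Eugenia uniflora", "A. Silva"),
    ("Eugenia uniflora", ""),
]


def identifier_counts(records):
    """Count how often each non-empty identifier name appears in the dataset."""
    return Counter(name for _, name in records if name)


def classify(records, specialists):
    """Label each record 'specialist' or 'other' based on its identifier name."""
    return [
        (species, name, "specialist" if name in specialists else "other")
        for species, name in records
    ]


# Candidate names come from the dataset itself; which of them count as
# specialists is assumed here to be decided by a curator or an expert list.
counts = identifier_counts(records)
specialists = {"A. Silva"}
labels = classify(records, specialists)

for species, name, level in labels:
    print(f"{species:20s} {name or '(no identifier)':18s} {level}")
```

Keeping only the records labeled `specialist` corresponds to the filtering step whose effect on niche and diversity estimates is evaluated in the study.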
