Abstract

The Global Biodiversity Information Facility (GBIF) is one of the most valuable initiatives for making biodiversity data massively available, and it is creating new opportunities to develop and test macroecological knowledge. However, the potential uses of these data are limited by the gaps and biases associated with large-scale distributional databases (the so-called Wallacean shortfall). Describing and quantifying these limitations is essential to improve knowledge on biodiversity, especially in poorly studied groups such as mosses. Here we assess the coverage of the publicly available distributional information on Iberian mosses, identifying its potential biases and gaps. For this purpose, we compiled IberBryo v1.0, a database that comprises 82,582 records after processing and checking the geospatial and taxonomic information. Our results show the limitations of the data and metadata of the publicly available information. In particular, ca. 42% of the records lacked collecting date information, which limits the usefulness of the data for temporal coverage analyses and enlarges the existing knowledge gaps. We then evaluated the overall coverage of several aspects of the spatial, temporal and environmental variability of the Iberian Peninsula. Through this assessment, we demonstrate that the publicly available information on Iberian mosses presents significant biases. Inventory completeness is strongly conditioned by the recorders' survey bias, particularly in northern Portugal and eastern Spain, and the spatial pattern of surveys is also biased towards mountains. In addition, survey effort intensifies from 1970 onwards, accompanied by a progressive increase in the geographic coverage of the Iberian Peninsula. Although only 5% of the cells at 30′ resolution were well surveyed over the 1970-2018 period, they cover about a fifth of the main climatic gradients of the Iberian Peninsula, which provides a fair, though limited, coverage.
Yet, the well-surveyed cells are biased towards anthropised areas, and some of them are located in areas under intense land-use change, mainly due to the wildfires of the last decade. Despite the overall increase in coverage, we found a noticeable gap of information in the south-west of Iberia, the Ebro river basin and the inner plateaus. All these gaps and biases call for a careful use of the available distributional data of Iberian mosses in biogeographical and ecological modelling analyses. Further, our results highlight the need to adopt several good practices to increase the coverage of high-quality information. These good practices include the digitisation of specimens and their metadata, improved protocols for obtaining accurate data and metadata, and revisions of the vouchers and recorders' field notebooks. These procedures are essential to improve the quality and coverage of the data. Finally, we also encourage Iberian bryologists to establish a series of re-surveys of classical localities that would allow the information on the group to be updated, and to design their future surveys considering the most important information gaps in IberBryo.
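
As an illustration of the kind of gridded summary described in the abstract, the following is a minimal sketch of how dated occurrence records might be binned into 30′ (0.5°) cells and counted per cell. It is not the procedure used to build IberBryo: the function names, the record layout (latitude, longitude, year) and the sample records are all hypothetical, and a real assessment of inventory completeness would need far more than raw counts.

```python
import math
from collections import Counter

CELL_DEG = 0.5  # 30 arc-minutes expressed in decimal degrees


def cell_index(lat, lon, cell_deg=CELL_DEG):
    """Return the (row, col) index of the grid cell containing a point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def dated_in_period(records, first_year, last_year):
    """Keep only records with a collecting year inside the given period."""
    return [r for r in records
            if r[2] is not None and first_year <= r[2] <= last_year]


def records_per_cell(records):
    """Count how many records fall in each grid cell."""
    return Counter(cell_index(lat, lon) for lat, lon, _year in records)


# Hypothetical records: (latitude, longitude, collecting year or None)
sample = [
    (40.42, -3.70, 1985),
    (40.45, -3.65, 1992),
    (41.15, -8.61, 2003),
    (37.18, -3.60, None),   # no collecting date, dropped by the filter
    (42.60, -0.50, 1950),   # outside the period, dropped by the filter
]

dated = dated_in_period(sample, 1970, 2018)
counts = records_per_cell(dated)
undated_share = sum(1 for r in sample if r[2] is None) / len(sample)
```

In this sketch, records without a collecting date cannot contribute to any period-based count, which is why a large share of undated records reduces the usable temporal coverage.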

Highlights

  • The current massive availability of biodiversity data is creating new opportunities to develop and test macroecological knowledge (Hampton et al 2013, Morueta-Holme and Svenning 2018)

  • We recovered 14.91% (14,549) of the GBIF records, mainly by assigning coordinates from the locality description, whereas we had to delete 19.15% (18,696) of them because of geospatial errors (Fig. 1, see Suppl. material 2)

  • Scientific names were unified in IberBryo v1.0 into 869 different species from 57 families (857 out of 893 Spanish taxa, 369 out of 522 Portuguese taxa and 207 out of 274 Andorran taxa; totals extracted from Ros et al 2013)


Summary

Introduction

The current massive availability of biodiversity data is creating new opportunities to develop and test macroecological knowledge (Hampton et al 2013, Morueta-Holme and Svenning 2018). Advances in the management (i.e. acquisition, cleaning and integration) and analysis of ‘biodiversity big data’ are crucial (Gandomi and Haider 2015, Devictor and Bensaude-Vincent 2016), promoting the emergence of new fields such as ecoinformatics and biodiversity informatics (Bisby 2000, Soberón and Peterson 2004). Advances in big biodiversity data tools and computational power are continually increasing the potential offered by this information (Bisby 2000, Maldonado et al 2015, Devictor and Bensaude-Vincent 2016, Wüest et al 2019). The most common limitations of biodiversity data are related to georeferencing and taxonomy (Soberón and Peterson 2004, Wieczorek et al 2004, Yesson et al 2007, Sousa-Baena et al 2014, Isaac and Pocock 2015), and data cleaning processes play an important role in addressing them (Chapman 2005, Gandomi and Haider 2015, Maldonado et al 2015, Gueta and Carmel 2016, Calabrese 2019).

