Abstract

Digitized species occurrence data provide an unprecedented source of information for ecologists and conservationists. Species distribution models (SDMs) have become a popular method for using these data to understand the spatial and temporal distribution of species and to model biodiversity patterns. Our objective is to study the impact of noise in species occurrence data (namely sample size and positional accuracy) on the performance and reliability of SDMs, considering the multiplicative impact of SDM algorithms, species specialisation, and grid resolution. We created a set of four ‘virtual’ species characterized by different specialisation levels. For each of these species, we built suitable habitat models using five algorithms at two grid resolutions, with varying sample sizes and different levels of positional accuracy. We assessed the performance and the reliability of the SDMs using classic model evaluation metrics (Area Under the Curve and True Skill Statistic) and model agreement metrics (Overall Concordance Correlation Coefficient and geographic niche overlap), respectively. Our study revealed that species specialisation had by far the strongest impact on the SDMs. In contrast to previous studies, we found that for widespread species, low sample size and low positional accuracy were acceptable, and useful distribution ranges could be predicted with as few as 10 species occurrences. Range predictions for narrow-ranged species, however, were sensitive to sample size and positional accuracy, such that useful distribution ranges required at least 20 species occurrences. Against expectations, the MAXENT algorithm predicted the distribution of specialist species poorly at low sample size.

Highlights

  • Understanding spatio-temporal distribution patterns of species is fundamental for ecology, conservation, biogeography, and many environmental studies

  • Our results revealed inconsistencies between the evaluation and agreement metrics regarding the minimum sample size of species occurrences required for SDM prediction

  • Although the Generalized Linear Model (GLM), Generalized Boosted Model (GBM), and Random Forest (RF) required a minimum of 20 species occurrences to successfully model the distribution ranges of generalist and restricted generalist species, only five species occurrences were required to successfully model relaxed specialist and specialist species (Fig 2 and Figs B–D in S1 Appendix)

Introduction

Understanding spatio-temporal distribution patterns of species is fundamental for ecology, conservation, biogeography, and many environmental studies. Species distribution models (SDMs) predict species distributions by quantifying relationships between species occurrence and associated environmental conditions [1,2,3]. One of the most widely used classes of SDM is the presence-background model, which was used in roughly 53% of SDM studies published between 2008 and 2014 [1]. A presence-background model compares the environmental conditions at the locations where a species was recorded (referred to as the ‘species occurrence’) with those at other points (background or pseudo-absence) distributed throughout the environmental domain [1].
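
The presence-background comparison described above, together with the two evaluation metrics named in the abstract (Area Under the Curve and True Skill Statistic), can be illustrated with a minimal Python sketch. It is not the authors' workflow: the suitability scores are made-up values, the function names are hypothetical, and background points are treated as pseudo-absences when computing specificity, which is an assumption of this example.

```python
import numpy as np

def auc_presence_background(presence_scores, background_scores):
    """Probability that a random presence point scores higher than a
    random background point (ties count one half)."""
    p = np.asarray(presence_scores, dtype=float)[:, None]
    b = np.asarray(background_scores, dtype=float)[None, :]
    # Compare every presence score with every background score.
    return float(np.mean((p > b) + 0.5 * (p == b)))

def tss(presence_scores, background_scores, threshold):
    """True Skill Statistic = sensitivity + specificity - 1 at a given
    threshold, treating background points as pseudo-absences (assumption)."""
    p = np.asarray(presence_scores, dtype=float)
    b = np.asarray(background_scores, dtype=float)
    sensitivity = np.mean(p >= threshold)  # presences predicted suitable
    specificity = np.mean(b < threshold)   # background predicted unsuitable
    return float(sensitivity + specificity - 1.0)

# Hypothetical predicted suitability at occurrence and background points.
presence = [0.9, 0.8, 0.7, 0.4]
background = [0.6, 0.3, 0.2, 0.1, 0.5]

print("AUC:", auc_presence_background(presence, background))
print("TSS:", tss(presence, background, threshold=0.55))
```

Both metrics only measure how well presence points are separated from background points, so with very few occurrences a single misplaced record can shift them noticeably.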
