Abstract
Species distribution models are generic empirical techniques that have a number of applications. One of these applications is to determine which environmental conditions are most important for a species. The calculation of this variable importance depends on a number of assumptions, including that the observations used to estimate the models are independent of each other. Spatial autocorrelation, which is a common feature of most environmental factors, confounds this assumption. Moreover, many species distribution models are trained using a number of explanatory variables that have different levels of spatial autocorrelation. In this study we quantified the effects of differences in spatial autocorrelation in explanatory variables, and of the type of species response to environmental gradients, on variable importance estimations in species distribution models. We simulated data for both environmental predictors and species, so that we controlled both the true contribution of every variable in the model and the importance that could be estimated after fitting the models. We found that spatial autocorrelation in the predictors inflated the variable importance estimates, but only when the response of species to the environmental gradients was linear. This inflation effect was larger when the environmental preferences of species coincided with the dominant environmental conditions in a study site. Additionally, we found that unimodal responses to the predictors systematically yielded higher variable importance than linear responses. We conclude that the type of response to environmental conditions and the relative levels of spatial autocorrelation in the environmental variables cause the most bias in relative variable importance estimations. This study thus helps to clarify, through a systematic and controlled approach, how to make proper inferences about variable importance in species distribution models.
Highlights
Species distribution modelling methods have been used frequently by ecologists to define the geographic ranges of species, and to infer the factors determining the realised niche of the species (Peterson and Soberón 2012)
This study clarifies the basic functionalities of species distribution models when faced with autocorrelation in the environmental conditions, different types of species response curve geometry, and the overlap between the niche and environmental conditions
We showed that the shape of the response curve and the overlap between species niche and environmental conditions modify the effects of spatial autocorrelation in environmental conditions on variable importance estimations
Summary
Species distribution modelling methods have been used frequently by ecologists to define the geographic ranges of species, and to infer the factors determining the realised niche of the species (Peterson and Soberón 2012). When making these inferences, researchers should be aware that models based on statistical correlations do not necessarily uncover causal mechanisms. To create models that capture relevant relationships, many studies reduce the total list of possible explanatory variables to a shorter array of the most important variables. This is usually accomplished by using measures of variable importance that rank all the possible species–environment relationships. The use of p-values, standardised coefficients and partial response curves can help to compare variable influence across models (Naimi and Araújo 2016).
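The kind of controlled simulation described in the abstract, combined with standardised coefficients as an importance measure, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the study's actual method: the function names (`smooth_field`, `morans_i`, `standardised_importance`), the 64 x 64 wrapped grid, the moving-average smoothing used to create spatial autocorrelation, the equal true coefficients, the noise level, and the use of ordinary least squares on a continuous suitability score in place of a full species distribution model.

```python
import numpy as np


def smooth_field(rng, size, passes):
    """Hypothetical predictor: white noise smoothed by repeated 3x3 averages.

    More passes give stronger spatial autocorrelation. Edges wrap around
    (torus) because np.roll is used for the neighbour shifts.
    """
    field = rng.standard_normal((size, size))
    for _ in range(passes):
        field = sum(
            np.roll(np.roll(field, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
        ) / 9.0
    # Rescale to zero mean and unit variance so predictors are comparable.
    return (field - field.mean()) / field.std()


def morans_i(field):
    """Moran's I with rook (4-neighbour) weights on the wrapped grid."""
    z = field - field.mean()
    cross = sum(
        (z * np.roll(z, shift, axis=axis)).sum()
        for axis in (0, 1)
        for shift in (-1, 1)
    )
    # N / W = 1 / 4 because every cell has exactly four neighbours.
    return cross / (4.0 * (z ** 2).sum())


def standardised_importance(X, y):
    """Relative importance as the share of absolute standardised OLS coefficients."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    share = np.abs(beta)
    return share / share.sum()


rng = np.random.default_rng(0)
x_low = smooth_field(rng, 64, passes=0)   # no smoothing: weak autocorrelation
x_high = smooth_field(rng, 64, passes=5)  # smoothed: strong autocorrelation

X = np.column_stack([x_low.ravel(), x_high.ravel()])
# Assumed linear response with equal true contributions from both predictors.
y = X @ np.array([1.0, 1.0]) + rng.normal(0.0, 0.5, X.shape[0])

importance = standardised_importance(X, y)
print("Moran's I (low, high):", morans_i(x_low), morans_i(x_high))
print("Relative importance (low, high):", importance)
```

Because the true contributions are fixed in advance, any departure of the estimated shares from them can be attributed to the simulation settings. Swapping the linear response for a unimodal one, or replacing the least-squares fit with the modelling method of interest, would be natural variations for exploring the contrasts discussed above.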