Abstract

The increasing use of species distribution modeling (SDM) has raised new concerns regarding the inaccuracies, misunderstandings, and misuses of this important tool. One of those possible pitfalls, collinearity among environmental predictors, is assumed to be an important source of model uncertainty, although it has not been subjected to a detailed evaluation in recent SDM studies. Collinearity is expected to increase uncertainty in model parameters and decrease statistical power. Here we use a virtual species approach to compare models built using subsets of PCA-derived variables with models based on the original, highly correlated climate variables. Moreover, we evaluated whether modeling algorithms and species data characteristics generate models with varying sensitivity to collinearity. As expected, collinearity among predictors decreases the efficiency and increases the uncertainty of species distribution models. Nevertheless, the intensity of the effect varied according to the algorithm properties: more complex procedures behaved better than simple envelope models. This may support the claim that complex models such as Maxent take advantage of existing collinearity in finding the best set of parameters. The interaction of the different factors with species characteristics (centroid and tolerance in environmental space) highlighted the importance of the so-called “idiosyncrasy in species responses” to model efficiency, but differences in prevalence may represent a better explanation. However, even models with low accuracy in predicting the suitability of individual cells may provide meaningful information for the estimation of range size, a key species trait for macroecological studies.
We conclude that the use of PCA-derived variables is advisable both to control the negative effects of collinearity and as a more objective solution to the problem of variable selection in studies dealing with a large number of species with heterogeneous responses to environmental variables.
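
As an illustration of why PCA-derived variables sidestep collinearity, the following minimal sketch builds a few synthetic, highly correlated predictors and replaces them with principal component scores. The data, the variable names, and the helper `pca_scores` are hypothetical and are not taken from the study; the sketch shows the general technique only, not the authors' workflow.

```python
import numpy as np

def pca_scores(X, n_components):
    """Return the first PCA scores of X and their share of total variance.

    Hypothetical helper for illustration; rows are grid cells,
    columns are environmental predictors.
    """
    # Standardize each predictor so measurement units do not dominate the axes.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized matrix; rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt.T
    explained = s**2 / np.sum(s**2)
    return scores[:, :n_components], explained[:n_components]

# Synthetic "climate" data (assumption): three predictors that are
# near-copies of one gradient, plus one independent predictor.
rng = np.random.default_rng(0)
n_cells = 500
gradient = rng.normal(size=n_cells)
X = np.column_stack([
    gradient,
    gradient + 0.1 * rng.normal(size=n_cells),
    -gradient + 0.1 * rng.normal(size=n_cells),
    rng.normal(size=n_cells),
])

scores, explained = pca_scores(X, n_components=2)

# Largest absolute pairwise correlation, ignoring the diagonal.
max_raw_corr = np.abs(np.corrcoef(X, rowvar=False) - np.eye(4)).max()
max_pc_corr = np.abs(np.corrcoef(scores, rowvar=False) - np.eye(2)).max()

print(f"max |r| among raw predictors: {max_raw_corr:.3f}")
print(f"max |r| among PCA scores:     {max_pc_corr:.2e}")
```

The raw predictors are strongly correlated by construction, while the PCA scores are uncorrelated up to floating-point error, so a model fitted to the scores does not have to separate the effects of near-duplicate variables.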

Highlights

  • The interaction of all of our main explanatory variables was statistically significant for the True Skill Statistic (TSS), but not for overprediction rate (OP) and underprediction rate (UP) (Table 2)

  • […] explains 2.6% of the variation in OP, but 36% of the variation in UP and 56% of the variation in TSS in our experiment. This supports our initial use of this covariate, but raises a special concern about studies that do not control for this variable when evaluating the predictive ability of species distribution modeling (SDM) procedures

  • No comprehensive test of these effects exists in the literature except Dormann et al. [35], who “. . . hesitantly conclude from our analysis that collinearity is a lesser problem than overfitting . . . or data uncertainty”
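
For readers unfamiliar with the True Skill Statistic mentioned in the highlights, the sketch below computes it from binary presence/absence maps using the standard definition, TSS = sensitivity + specificity − 1. The function name and the toy data are hypothetical. The study's exact definitions of OP and UP are not given in this excerpt, so they are deliberately not reproduced here.

```python
import numpy as np

def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1 for binary presence/absence data.

    Hypothetical helper for illustration; inputs are per-cell 0/1 values.
    """
    observed = np.asarray(observed, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = np.sum(observed & predicted)     # presences correctly predicted
    fn = np.sum(observed & ~predicted)    # presences missed
    tn = np.sum(~observed & ~predicted)   # absences correctly predicted
    fp = np.sum(~observed & predicted)    # absences predicted as presences
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

# Toy example (assumption): 10 grid cells, 4 truly occupied.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tss = true_skill_statistic(observed, predicted)
print(f"TSS = {tss:.3f}")
```

TSS ranges from −1 to 1, with 1 indicating perfect agreement and values near 0 indicating performance no better than random.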


Summary

Introduction

Species distribution modeling (SDM) is an interesting and efficient tool to deal with a variety of questions related to species geographic distributions [1,2,3,4]: What is the distribution of rare […] The only quantitative evaluation of uncertainty related to collinearity is the use of PCA and sequential regression explored by Dormann et al. [35] in an analysis of the distribution of the Great Gray Shrike (Lanius excubitor L.) in relation to climate change. They found that collinearity was of minor importance compared to other sources of uncertainty, but the analysis of only one species limits the generalization of these results. Intrinsic characteristics of a species directly affect its range size, prevalence, range geometry, and ecological/geographical marginality, all important features that affect SDM performance [43,44,45,46]. To deal with such complexity and to bring more generality to our evaluation, we use the “virtual species” approach [47], controlling key features of the modeled species and environmental data. As both approaches encompass interesting ecological questions, our analysis was designed to deal with the accuracy of predictions both for the species distribution in each cell and for the total range size of the modeled species.

