Abstract

Correlative species distribution models (SDMs) are increasingly used to predict suitable insect habitats. They are also widely criticized for prediction discrepancies among different SDMs for the same species and for the lack of effective communication about SDM prediction uncertainty. In this paper, we undertook a factorial study to investigate the effects of various modeling components (species training datasets, predictor variables, dimension-reduction methods, and model types) on the accuracy of SDM predictions, with the aim of identifying sources of discrepancy and uncertainty. Among the modeling components tested, model type was the major factor causing variation in species-distribution predictions. Different combinations of modeling components could also significantly increase or decrease the performance of a model, which indicates the importance of keeping modeling components constant when comparing SDM results. With all modeling components held constant, machine-learning models seem to outperform other model types. On average, the Hierarchical Non-Linear Principal Components Analysis dimension-reduction method improved model performance more than the other methods tested. The widely used confusion-matrix-based model-performance indices, such as the area under the receiver operating characteristic curve (AUC), sensitivity, and Kappa, do not necessarily help select the best model from a set of models if the variation in performance is not large. To conclude, discrepancies among model results do not necessarily suggest a lack of robustness in correlative modeling, as they can also arise from inappropriate selection of modeling components. In addition, more research on model performance evaluation is required to develop robust and sensitive model evaluation methods.
Undertaking multi-scenario species-distribution modeling, where possible, is likely to mitigate errors arising from inappropriate selection of modeling components and to provide end users with better information on the resulting model prediction uncertainty.
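
The performance indices named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy presence/absence data, and the 0.5 threshold are all assumptions made for illustration. Sensitivity, specificity, and Cohen's Kappa are computed from a thresholded confusion matrix, while AUC is computed from the raw suitability scores as the probability that a randomly chosen presence scores higher than a randomly chosen absence.

```python
def confusion_counts(observed, predicted):
    """Return (tp, fp, tn, fn) for binary presence (1) / absence (0) labels."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 0 and pred == 1:
            fp += 1
        elif obs == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Proportion of observed presences predicted as presences."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of observed absences predicted as absences."""
    return tn / (tn + fp)


def cohen_kappa(tp, fp, tn, fn):
    """Agreement between prediction and observation, corrected for chance."""
    n = tp + fp + tn + fn
    observed_agreement = (tp + tn) / n
    chance_agreement = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)


def auc(observed, scores):
    """Rank-based AUC: share of presence/absence pairs ranked correctly (ties count 0.5)."""
    pos = [s for o, s in zip(observed, scores) if o == 1]
    neg = [s for o, s in zip(observed, scores) if o == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 presences, 4 (pseudo-)absences, and model suitability scores.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
threshold = 0.5  # assumed cut-off; the paper's thresholding rule is not stated here
predicted = [1 if s >= threshold else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(observed, predicted)
print(sensitivity(tp, fn), specificity(tn, fp), cohen_kappa(tp, fp, tn, fn), auc(observed, scores))
```

Because the threshold-based indices depend on the chosen cut-off and AUC depends only on the ranking of scores, two models can score similarly on all of them while still producing different maps, which is consistent with the abstract's caution about using these indices to pick a single best model.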

Highlights

  • Various species distribution models have been used to predict suitable insect habitats

  • The multivariate analysis of variance (MANOVA) results (Table 3) showed that all modeling components and their interactions, with the exception of predictor choice (P), had a significant effect on the linear combination of the five model performance scores: Kappa, area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and cross-validation error (CV-error)

  • This study shows that the predictive performance of different model types depends mainly on data pre-processing, that is, on pseudo-absence dataset development, the dimension-reduction method, and species- and predictor-dataset selection
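
A factorial design over the modeling components listed above can be sketched as follows. This is a hypothetical illustration only: the level names (`species_A`, `predictors_1`, `none`, `PCA`, and so on) are placeholders rather than the study's actual datasets or methods, and the model types are simply those named as examples in the introduction.

```python
from itertools import product

# Hypothetical levels for each modeling component (placeholders, not the study's design).
components = {
    "species_dataset": ["species_A", "species_B"],
    "predictor_set": ["predictors_1", "predictors_2"],
    "dimension_reduction": ["none", "PCA", "hierarchical_NLPCA"],
    "model_type": ["BIOCLIM", "DOMAIN", "SVM", "ANN"],
}


def factorial_scenarios(components):
    """Cross every level of every component into one scenario per combination."""
    names = list(components)
    return [dict(zip(names, combo)) for combo in product(*components.values())]


scenarios = factorial_scenarios(components)
print(len(scenarios))  # 2 * 2 * 3 * 4 combinations
print(scenarios[0])
```

Running every scenario and recording the same performance scores for each is what allows the effect of one component (for example, model type) to be separated from the others, and it is also the basis of the multi-scenario modeling the abstract recommends.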

Introduction

Various species distribution models have been used to predict suitable insect habitats. For example, the bioclimatic analysis and prediction system, BIOCLIM [15], and the point-to-point similarity metric system, DOMAIN [16], have been reported to be suitable for predicting the distribution of rare species that occupy a limited environmental niche, representing simple linear interactions among environmental variables. Complex models, such as Support Vector Machines (SVMs) [17] and Artificial Neural Networks (ANNs), use non-linear functions and a large number of variables, and can therefore handle complex interactions within a multidimensional variable space [5,18]. Jiménez-Valverde et al. [19] and Chefaoui and

