Prevalence dependence in model goodness measures with special emphasis on true skill statistics.

Imelda Somodi, Nikolett Lepesi, Zoltán Botta‐Dukát

doi:10.1002/ece3.2654

Open Access

https://doi.org/10.1002/ece3.2654

Abstract

It has long been a concern that performance measures of species distribution models react to attributes of the modeled entity arising from the input data structure rather than to model performance. Thus, the study of Allouche et al. (Journal of Applied Ecology, 43, 1223, 2006) identifying the true skill statistics (TSS) as being independent of prevalence had a great impact. However, empirical experience questioned the validity of the statement. We searched for technical reasons behind these observations. We explored possible sources of prevalence dependence in TSS including sampling constraints and species characteristics, which influence the calculation of TSS. We also examined whether the widespread solution of using the maximum of TSS for comparison among species introduces a prevalence effect. We found that the design of Allouche et al. (Journal of Applied Ecology, 43, 1223, 2006) was flawed, but TSS is indeed independent of prevalence if model predictions are binary and under the strict set of assumptions methodological studies usually apply. However, if we take realistic sources of prevalence dependence, effects appear even in binary calculations. Furthermore, in the widespread approach of using maximum TSS for continuous predictions, the use of the maximum alone induces prevalence dependence for small, but realistic samples. Thus, prevalence differences need to be taken into account when model comparisons are carried out based on discrimination capacity. The sources we identified can serve as a checklist to safely control comparisons, so that true discrimination capacity is compared as opposed to artefacts arising from data structure, species characteristics, or the calculation of the comparison measure (here TSS).
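The prevalence independence claimed for binary predictions can be illustrated with a minimal sketch. It uses the standard definition TSS = sensitivity + specificity - 1 and assumes, for illustration only, that sensitivity and specificity stay fixed while prevalence changes. The function names `tss` and `expected_counts` and all numeric values are hypothetical and do not come from the paper.

```python
def tss(tp, fn, fp, tn):
    """True skill statistic from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of presences predicted as present
    specificity = tn / (tn + fp)  # share of absences predicted as absent
    return sensitivity + specificity - 1


def expected_counts(n, prevalence, sensitivity, specificity):
    """Expected confusion-matrix cells for n sites (illustrative helper)."""
    presences = n * prevalence
    absences = n - presences
    tp = presences * sensitivity
    fn = presences - tp
    tn = absences * specificity
    fp = absences - tn
    return tp, fn, fp, tn


# Same assumed sensitivity (0.8) and specificity (0.7) at three prevalences.
for prevalence in (0.1, 0.5, 0.9):
    counts = expected_counts(1000, prevalence, 0.8, 0.7)
    print(prevalence, round(tss(*counts), 3))
```

Under these assumptions the printed TSS is the same at every prevalence, because prevalence cancels out of both ratios. The prevalence effects described in the abstract arise once those assumptions are relaxed.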

Highlights

Measuring model performance is a central issue in species distribution modeling (SDM, Guisan & Zimmermann, 2000) and predictive vegetation modeling (PVM, Franklin, 1995).
We found a response to prevalence changes in the maximum value of true skill statistics (TSS) for small sample sizes (Figures 2 and 3), which decreased with an increase in sample size and approached the theoretically expected value.
The dependence at low sample size had a U-shaped form, implying that the same model goodness can result in a higher maximum TSS solely due to a low or high prevalence if sample size is low.
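The small-sample behavior of maximum TSS can be explored with a rough simulation sketch. This is not the authors' simulation design. It assumes that continuous model scores are normally distributed with unit variance, with presences shifted upward by a hypothetical `separation`, and that maximum TSS is taken over all thresholds found in the sample. All names and parameter values are illustrative.

```python
import random
from statistics import NormalDist


def max_tss(scores, labels):
    """Maximum TSS over all thresholds taken from the observed scores."""
    n_pres = sum(labels)
    n_abs = len(labels) - n_pres
    # Lower the threshold step by step; sites at or above it are predicted present.
    order = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    best = 0.0  # a threshold above every score gives TSS = 0
    for i, (score, label) in enumerate(order):
        if label == 1:
            tp += 1
        else:
            fp += 1
        if i + 1 < len(order) and order[i + 1][0] == score:
            continue  # evaluate only after all tied scores are included
        # sensitivity + specificity - 1 simplifies to tp/n_pres - fp/n_abs
        best = max(best, tp / n_pres - fp / n_abs)
    return best


def mean_max_tss(n, prevalence, separation=1.0, replicates=200, seed=0):
    """Average maximum TSS over simulated samples (assumed normal scores)."""
    rng = random.Random(seed)
    n_pres = min(max(round(n * prevalence), 1), n - 1)  # keep both classes
    labels = [1] * n_pres + [0] * (n - n_pres)
    total = 0.0
    for _ in range(replicates):
        scores = [rng.gauss(separation * y, 1.0) for y in labels]
        total += max_tss(scores, labels)
    return total / replicates


def theoretical_max_tss(separation=1.0):
    """Population maximum TSS for two unit-variance normal score distributions."""
    return 2 * NormalDist().cdf(separation / 2) - 1


print("theoretical:", round(theoretical_max_tss(), 3))
for n in (20, 1000):
    for prevalence in (0.1, 0.5, 0.9):
        print(n, prevalence, round(mean_max_tss(n, prevalence), 3))
```

Comparing the rows for the small and the large sample against the theoretical value shows how much of the maximum TSS is produced by taking a maximum over a noisy sample rather than by discrimination capacity.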

Summary

Introduction

Measuring model performance (goodness) is a central issue in species distribution modeling (SDM, Guisan & Zimmermann, 2000) and predictive vegetation modeling (PVM, Franklin, 1995). Performance measures are used for three major tasks: 1) comparing modeling techniques, typically using one dataset and the same species with each technique (e.g., Jones, Acker, & Halpern, 2010; Zurell et al., 2012), 2) comparing the performance of models of different species with one or more modeling techniques using one dataset (e.g., Coetzee, Robertson, Erasmus, Van Rensburg, & Thuiller, 2009; Engler et al., 2013; Pliscoff, Luebert, Hilger, & Guisan, 2014), and 3) testing models of the same species on different datasets (e.g., Randin et al., 2006; Ribeiro, Somodi, & Čarni, 2016). In contrast to the first task, when models of different species or predictions on different datasets are compared, characteristics of the data (including prevalence) may influence model performance.

Journal: Ecology and Evolution
Publication Date: Jan 12, 2017
Citations: 88
License type: CC BY 4.0

Similar Papers

Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS)
Omri Allouche ... Ronen Kadmon
Journal of Applied Ecology | VOL. 43
12 Sep 2006

Without quality presence–absence data, discrimination metrics such as TSS can be misleading measures of model performance
Boris Leroy ... Robin Delsol
Journal of Biogeography | VOL. 45
02 Jul 2018

How does spatial resolution affect model performance? A case for ensemble approaches for marine benthic mesophotic communities
Joseph A. Turner ... Russell C. Babcock
Journal of Biogeography | VOL. 46
02 May 2019

Predicting Species Distributions from Samples Collected along Roadsides
Kyle P Mccarthy ... Christopher T Rota
Conservation Biology | VOL. 26
19 Oct 2011
