Abstract

Aim: The proportion of sampled sites where a species is present is known as prevalence. Empirical studies have shown that prevalence can affect the predictive performance of species distribution models. This paper uses simulated species data to examine how prevalence and the form of species environmental dependence affect the assessment of the predictive performance of models.

Methods: Simulated species data were based on various functions of simulated environmental data with differing degrees of spatial correlation. Seven model performance measures (sensitivity, specificity, class-average (CA), overall prediction success, kappa (κ), normalized mutual information (NMI) and area under the receiver operating characteristic curve (AUC)) were applied to species models fitted by three regression methods. The response of the performance measures to prevalence was then assessed. Three probability threshold selection methods used to convert fitted logistic model values to presence or absence were also assessed.

Results: The study shows that the extent to which prevalence affects model performance depends on the modelling technique and its degree of success in capturing dominant environmental determinants. It also depends on the statistic used to measure model performance and the probability threshold method. The response based on κ generally preferred models with medium prevalence. All performance measures were least affected by prevalence when the probability threshold was chosen to maximize predictive performance or was based directly on prevalence. In these cases, the responses based on AUC, CA and NMI generally preferred models with small or large prevalence.

Main conclusions: The effect of prevalence on the predictive performance of species distribution models has a methodological basis. Relevant factors include the success of the fitted distribution model in capturing the dominant environmental determinant, the model performance measure and the probability threshold selection method. The fixed probability threshold method yields a marked response of model performance to prevalence and is therefore not recommended. The study explains previous empirical results obtained with real data.
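
As an illustration of the threshold-dependent measures named in the Methods, the following minimal Python sketch converts fitted probabilities to presence or absence and computes sensitivity, specificity, CA, overall prediction success and κ from the resulting confusion matrix. It is not the paper's code. The function names, the toy data, and the reading of CA as the mean of sensitivity and specificity are assumptions made for this example. NMI and AUC are omitted.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # observed present, predicted present
    fp: int  # observed absent, predicted present
    fn: int  # observed present, predicted absent
    tn: int  # observed absent, predicted absent


def prevalence(observed):
    """Proportion of sampled sites where the species is present."""
    return sum(observed) / len(observed)


def confusion(observed, probs, threshold):
    """Classify each site as present when its fitted probability >= threshold."""
    tp = fp = fn = tn = 0
    for obs, p in zip(observed, probs):
        pred = p >= threshold
        if obs and pred:
            tp += 1
        elif pred:
            fp += 1
        elif obs:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, fn, tn)


def measures(c):
    """Threshold-dependent measures; assumes both classes occur in the data."""
    n = c.tp + c.fp + c.fn + c.tn
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    overall = (c.tp + c.tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((c.tp + c.fp) * (c.tp + c.fn)
                + (c.tn + c.fn) * (c.tn + c.fp)) / n ** 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Assumption: CA taken as the mean of sensitivity and specificity.
        "class_average": (sensitivity + specificity) / 2,
        "overall": overall,
        "kappa": (overall - expected) / (1 - expected),
    }


# Hypothetical toy data: 10 sites, species present at 2 of them.
observed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.6, 0.3, 0.4, 0.2, 0.1, 0.1, 0.2, 0.3, 0.1, 0.25]

fixed = measures(confusion(observed, probs, 0.5))
by_prevalence = measures(confusion(observed, probs, prevalence(observed)))
```

Comparing `fixed` with `by_prevalence` shows how the choice of threshold alone changes every measure for the same fitted probabilities. With a rare species, a fixed threshold of 0.5 tends to trade sensitivity for specificity.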
