Abstract
Sets of presence records used to model species’ distributions typically consist of observations collected opportunistically rather than systematically. As a result, sampling probability is geographically uneven, which may confound the model's characterization of the species’ distribution. Modelers frequently address sampling bias by manipulating training data: either subsampling presence data or creating a similar spatial bias in non‐presence background data. We tested a new method in the latter category, which we call ‘background thickening’. Background thickening entails concentrating background locations around presence locations in proportion to presence location density. We compared background thickening to two established sampling bias correction methods – target group background selection and presence thinning – using simulated data and data from a case study. In the case study, background thickening and presence thinning performed similarly well, both producing better model discrimination than target group background selection, and better model calibration than models without correction. In the simulation, background thickening performed better than presence thinning when the number of simulated presence locations was low, and presence thinning performed better when that number was high. We discuss drawbacks to target group background selection, why background thickening and presence thinning are conservative but robust sampling bias correction methods, and why background thickening is better than presence thinning for small sample sizes. In particular, background thickening is advantageous for treating sampling bias when data are scarce because it avoids discarding presence records.
Highlights
Collected presence data harbor vast information about species’ distributions, but distribution models based on these data risk mischaracterizing occurrence–environment relationships (Guisan and Zimmermann 2000, Ponder et al. 2001)
We examine the effects of background thickening compared to no bias correction and two established bias correction methods: target group background selection and presence thinning
With samples comprising 250 presences, both metrics of performance were highest for models employing presence thinning, followed by models using background thickening, and models without bias correction (Fig. 2)
Summary
Collected presence data harbor vast information about species’ distributions, but distribution models based on these data risk mischaracterizing occurrence–environment relationships (Guisan and Zimmermann 2000, Ponder et al. 2001). Presence-background distribution models estimate relative presence probability by comparing presence locations (hereafter: ‘presences’) to a background that consists of all locations in the study area: locations where the species is present as well as ‘uninformed background’ locations where its occurrence is unknown (Phillips and Elith 2013, Halvorsen et al. 2015). These models are especially vulnerable to the effects of sampling bias, and usually require correction (Phillips et al. 2009). Estimated sampling probabilities supplied to the popular Maxent software as a ‘bias file’ are factored out of predictions formally (Phillips et al. 2006, Merow et al. 2013 Appendix 5, Merow et al. 2016). Another frequently employed strategy is to reduce the effects of sampling bias informally, by adjusting the training data (e.g. selecting presences or uninformed background locations under a specific scheme).
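To make the idea of concentrating background locations around presences more concrete, the following is a minimal sketch, not the authors' implementation. It assumes planar (projected) coordinates and draws a fixed number of background points uniformly inside a disc around each presence, so that background density ends up roughly proportional to local presence density. The function name `thicken_background` and the parameters `radius`, `n_per_presence`, and `seed` are hypothetical; the source does not specify the kernel shape, the radius, or how points falling outside the study area are handled.

```python
import math
import random


def thicken_background(presences, radius, n_per_presence, seed=None):
    """Illustrative sketch: sample background points around each presence.

    Assumptions (not taken from the source): planar coordinates, a uniform
    disc kernel of fixed `radius`, and no clipping to a study-area mask.
    """
    rng = random.Random(seed)
    background = []
    for x, y in presences:
        for _ in range(n_per_presence):
            # The square root makes points uniform over the disc's area
            # rather than clustered at its centre.
            r = radius * math.sqrt(rng.random())
            theta = rng.uniform(0.0, 2.0 * math.pi)
            background.append((x + r * math.cos(theta), y + r * math.sin(theta)))
    return background


# Three clustered presences and one isolated presence.
presences = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
background = thicken_background(presences, radius=1.0, n_per_presence=100, seed=0)

# Count background points near the cluster versus near the isolated record.
near_cluster = sum(1 for bx, by in background if math.hypot(bx, by) < 1.5)
near_isolated = sum(1 for bx, by in background if math.hypot(bx - 5.0, by - 5.0) < 1.5)
print(near_cluster, near_isolated)
```

Because every presence contributes the same number of background points, the cluster of three presences receives three times as many nearby background points as the isolated record, and no presence record is discarded in the process.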