Sets of presence records used to model species’ distributions typically consist of observations collected opportunistically rather than systematically. As a result, sampling probability is geographically uneven, which may confound the model’s characterization of the species’ distribution. Modelers frequently address sampling bias by manipulating training data: either subsampling presence data or creating a similar spatial bias in non‐presence background data. We tested a new method in the latter category, which we call ‘background thickening’. Background thickening entails concentrating background locations around presence locations in proportion to presence location density. We compared background thickening to two established sampling bias correction methods – target group background selection and presence thinning – using simulated data and data from a case study. In the case study, background thickening and presence thinning performed similarly well, both producing better model discrimination than target group background selection, and better model calibration than models without correction. In the simulation, background thickening performed better than presence thinning when the number of simulated presence locations was low, whereas presence thinning performed better when that number was high. We discuss drawbacks to target group background selection, why background thickening and presence thinning are conservative but robust sampling bias correction methods, and why background thickening is better than presence thinning for small sample sizes. In particular, background thickening is advantageous for treating sampling bias when data are scarce because it avoids discarding presence records.
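The idea of concentrating background locations in proportion to presence density can be illustrated with a minimal sketch. This is one plausible reading of the description above, not necessarily the authors' procedure: it assumes planar coordinates, a fixed sampling radius, and an equal number of background draws per presence location, so that overlapping discs in densely sampled areas accumulate more background points. The function name `thicken_background` and the parameters `radius` and `n_per_presence` are hypothetical.

```python
import math
import random


def thicken_background(presences, radius, n_per_presence, seed=0):
    """Draw background points uniformly within a disc around each presence.

    Assumption (illustrative only): every presence contributes the same
    number of draws, so background density follows presence density
    wherever the discs overlap.
    """
    rng = random.Random(seed)
    background = []
    for x, y in presences:
        for _ in range(n_per_presence):
            # The square root makes the draw uniform over the disc's area
            # rather than clustered at its center.
            r = radius * math.sqrt(rng.random())
            theta = 2.0 * math.pi * rng.random()
            background.append((x + r * math.cos(theta), y + r * math.sin(theta)))
    return background


# Hypothetical presences: a cluster of three near the origin, one isolated.
presences = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
background = thicken_background(presences, radius=1.0, n_per_presence=50)

# Count background points near the cluster and near the isolated presence.
near_cluster = sum(math.dist(b, (0.0, 0.0)) <= 1.5 for b in background)
near_isolated = sum(math.dist(b, (5.0, 5.0)) <= 1.5 for b in background)
```

In this toy setup the clustered presences receive three times as many nearby background points as the isolated one, and no presence record is discarded, which is the property the abstract highlights for small sample sizes.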