The effects of sampling bias and model complexity on the predictive performance of MaxEnt species distribution models.

Mindy M Syfert, Matthew J Smith, David A Coomes

doi:10.1371/journal.pone.0055158

Open Access

Abstract

Species distribution models (SDMs) trained on presence-only data are frequently used in ecological research and conservation planning. However, users of SDM software are faced with a variety of options, and it is not always obvious how selecting one option over another will affect model performance. Working with MaxEnt software and with tree fern presence data from New Zealand, we assessed whether (a) choosing to correct for geographical sampling bias and (b) using complex environmental response curves have strong effects on goodness of fit. SDMs were trained on tree fern data, obtained from an online biodiversity data portal, with two sources that differed in size and geographical sampling bias: a small, widely distributed set of herbarium specimens and a large, spatially clustered set of ecological survey records. We attempted to correct for geographical sampling bias by incorporating sampling bias grids in the SDMs, created from all georeferenced vascular plants in the datasets, and explored model complexity issues by fitting a wide variety of environmental response curves (known as “feature types” in MaxEnt). In each case, goodness of fit was assessed by comparing predicted range maps with tree fern presences and absences using an independent national dataset to validate the SDMs. We found that correcting for geographical sampling bias led to major improvements in goodness of fit, but did not entirely resolve the problem: predictions made with clustered ecological data were inferior to those made with the herbarium dataset, even after sampling bias correction. We also found that the choice of feature type had negligible effects on predictive performance, indicating that simple feature types may be sufficient once sampling bias is accounted for. Our study emphasizes the importance of reducing geographical sampling bias, where possible, in datasets used to train SDMs, and shows that sampling bias correction within MaxEnt is both effective and essential.
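
As a rough illustration of the sampling bias grids mentioned in the abstract, the sketch below counts target-group records (for example, all georeferenced vascular plant records) per grid cell and rescales the counts to relative sampling effort. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `bias_grid`, the square-cell gridding, and the `floor` value that keeps unsurveyed cells positive are all hypothetical choices made for illustration.

```python
from collections import Counter


def bias_grid(records, cell_size, n_rows, n_cols, floor=1.0):
    """Return an n_rows x n_cols grid of relative sampling effort.

    records   : iterable of (x, y) coordinates of target-group records
    cell_size : side length of a square grid cell, in coordinate units
    floor     : hypothetical constant added to every cell so that cells
                with no records still receive a small positive weight
    """
    counts = Counter()
    for x, y in records:
        col = int(x // cell_size)
        row = int(y // cell_size)
        # Ignore records that fall outside the grid extent.
        if 0 <= row < n_rows and 0 <= col < n_cols:
            counts[(row, col)] += 1

    grid = [[counts[(r, c)] + floor for c in range(n_cols)]
            for r in range(n_rows)]
    total = sum(sum(row) for row in grid)
    # Rescale so the cell weights sum to 1 (relative effort).
    return [[value / total for value in row] for row in grid]


# Toy example: three records on a 2 x 2 grid of unit cells.
toy_records = [(0.5, 0.5), (0.6, 0.4), (1.5, 0.5)]
toy_grid = bias_grid(toy_records, cell_size=1.0, n_rows=2, n_cols=2)
```

In this toy example the heavily sampled cell receives the largest weight, which is the information a bias grid is meant to pass to the model so that background points reflect where collectors actually looked.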

Highlights

Species distribution models (SDMs), which predict a species’ probability of occurrence across a landscape by relating documented locations of that species to environmental information, are frequently used in ecological, environmental and climate change research [1,2,3,4,5].
Correcting bias in the herbarium and National Vegetation Survey databank (NVS) datasets led to dramatic increases in Area Under the Curve (AUC) and COR values when model predictions were compared with observed tree fern presences and absences in the independent Land Use and Carbon Analysis System (LUCAS) dataset (Table 1).
Correcting for geographical sampling bias approximately halved the false-absence and false-presence error rates of distribution maps predicted with the NVS dataset (Table 2; Figure S1), and approximately halved the false-absence rate of distribution maps predicted with the herbarium dataset; paradoxically, the false-presence rate increased following the correction (Table 2).
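
The evaluation measures named in these highlights can be sketched in a few lines. The example below is a hedged illustration only: it computes AUC as the probability that a presence site outranks an absence site, COR as the Pearson correlation between predictions and 0/1 observations, and error rates at a single threshold. The definitions of the false-absence and false-presence rates (as fractions of observed presences and observed absences, respectively), the threshold of 0.5, and the toy data are assumptions, not values or definitions taken from the paper.

```python
from math import sqrt


def auc(scores, labels):
    """Rank-based AUC: chance that a presence scores above an absence."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Ties between a presence and an absence count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cor(scores, labels):
    """Pearson correlation between predictions and 0/1 observations."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_y = sum(labels) / n
    cov = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, labels))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_y = sum((y - mean_y) ** 2 for y in labels)
    return cov / sqrt(var_s * var_y)


def error_rates(scores, labels, threshold):
    """Return (false-absence rate, false-presence rate) at a threshold.

    Assumed definitions: false absences are observed presences predicted
    below the threshold; false presences are observed absences predicted
    at or above it.
    """
    false_abs = sum(1 for s, y in zip(scores, labels)
                    if y == 1 and s < threshold)
    false_pres = sum(1 for s, y in zip(scores, labels)
                     if y == 0 and s >= threshold)
    n_pres = sum(labels)
    n_abs = len(labels) - n_pres
    return false_abs / n_pres, false_pres / n_abs


# Toy validation data (hypothetical): predicted suitability and observations.
toy_scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_cor = cor(toy_scores, toy_labels)
toy_rates = error_rates(toy_scores, toy_labels, threshold=0.5)
```

AUC and COR are threshold-free, whereas the two error rates depend on the chosen cut-off, which is one reason a correction can lower one error rate while raising the other.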

Summary

Introduction

Species distribution models (SDMs), which predict a species’ probability of occurrence across a landscape by relating documented locations of that species to environmental information, are frequently used in ecological, environmental and climate change research [1,2,3,4,5]. There is a ready supply of environmental information, including global databases of climate and digital elevation models [9], and of user-friendly software packages. These technological advances mean that SDMs are being used in ecological research and conservation planning as never before. This paper explores how correcting for geographical sampling bias, and selecting model functional forms manually rather than automatically, affect the predictive ability of MaxEnt, one of the best performing species distribution modelling techniques for analysis of presence-only data [10,11,12,13].

Journal: PLoS ONE	Publication Date: Feb 14, 2013
Citations: 442	License type: CC BY 4.0
