Abstract

Understanding the spatial pattern of species distributions is fundamental to biogeography and to conservation and resource management applications. Most species distribution models (SDMs) require or prefer species presence and absence data for adequate estimation of model parameters. However, observations with unreliable or unreported species absences dominate available data and limit the implementation of SDMs. Presence-only models generally yield less accurate predictions of species distribution and make it difficult to incorporate spatial autocorrelation. The availability of large amounts of historical presence records for freshwater fishes of the United States provides an opportunity to derive reliable absences from data reported as presence-only when sampling was predominantly community-based. In this study, we used boosted regression trees (BRT), logistic regression, and MaxEnt models to assess the performance of a historical metacommunity database with inferred absences for modeling fish distributions, and thereby to investigate the effects of model choice and data properties. With models of the distribution of 76 native, non-game fish species of varied traits and rarity attributes in four river basins across the United States, we show that model accuracy depends on data quality (e.g., sample size, location precision), species’ rarity, statistical modeling technique, and consideration of spatial autocorrelation. The cross-validation area under the receiver-operating-characteristic curve (AUC) tended to be high in the spatial presence-absence models at the highest level of resolution for species with large geographic ranges and small local populations. Prevalence affected training but not validation AUC. The key habitat predictors identified and the fish-habitat relationships evaluated through partial dependence plots corroborated most previous studies.
The community-based SDM framework broadens our capability to model species distributions by removing the constraint imposed by the lack of species absence data. It thus provides a robust prediction of distribution for stream fishes in other regions where historical data exist, and for other taxa (e.g., benthic macroinvertebrates, birds) that are usually observed through community-based sampling designs.
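The idea of deriving absences from community-based presence records can be illustrated with a minimal Python sketch. It assumes every sampling event targeted the whole fish community, so a species not recorded in an event is treated as absent there. The function name `infer_absences`, the record layout, and the example species are hypothetical and are not taken from the study's database.

```python
from collections import defaultdict


def infer_absences(records, species_pool=None):
    """Build a presence-absence table from presence-only records.

    records: iterable of (sample_id, species) pairs, one per recorded presence.
    species_pool: optional set of candidate species; defaults to every
        species seen in any sample (an assumption of this sketch).
    Returns {sample_id: {species: 1 or 0}}.
    """
    present = defaultdict(set)
    for sample_id, species in records:
        present[sample_id].add(species)

    if species_pool is None:
        species_pool = set().union(*present.values())

    pool = sorted(species_pool)
    # A species missing from a community sample is inferred to be absent (0).
    return {
        sample_id: {sp: int(sp in found) for sp in pool}
        for sample_id, found in present.items()
    }


# Hypothetical presence-only records from three community samples.
records = [
    ("site_A", "darter"),
    ("site_A", "shiner"),
    ("site_B", "shiner"),
    ("site_C", "sculpin"),
]
matrix = infer_absences(records)
```

In this sketch a site appears in the table only if at least one species was recorded there, so samples that caught nothing would have to be supplied separately.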

Highlights

  • Understanding species-habitat relationships and the spatial pattern of species distributions is critical in biogeography, biodiversity conservation, and resource management [1, 2]

  • We tested whether incorporating spatial autocorrelation would improve the performance of the species distribution models, using the principal coordinate analysis of neighbor matrices (PCNM) approach [32, 56] in the R package ‘PCNM’ [57]

  • We focused on analyzing boosted regression tree (BRT) models for brevity, since the performance of the logistic model agreed with that of the BRT models in terms of the validation area under the receiver-operating-characteristic curve (AUC) (Fig 2, Table B in S3 File)
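The validation AUC used to compare models can be read as the probability that a randomly chosen presence site receives a higher predicted score than a randomly chosen absence site. The sketch below computes it directly from that definition. It is a plain-Python illustration with made-up labels and scores, not the authors' evaluation code (the study reports working in R).

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for presence, 0 for absence.
    scores: predicted suitability, higher meaning more likely present.
    Ties between a presence and an absence score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one presence and one absence")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up validation data: three presences and three absences.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
validation_auc = auc(labels, scores)  # 8 of 9 presence-absence pairs are ranked correctly
```

The pairwise loop is quadratic in the number of sites, which is fine for illustration; a rank-based formulation gives the same value more efficiently on large validation sets.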


Introduction

Understanding species-habitat relationships and the spatial pattern of species distributions is critical in biogeography, biodiversity conservation, and resource management [1, 2]. Species distribution models (SDMs) built on current biological sampling surveys can be used to design conservation or management plans [5, 6, 7]. NatureServe provides the most up-to-date electronic species distribution maps of US freshwater fauna and flora at the HUC-8 (8-digit hydrologic unit code) level (http://www.natureserve.org/), but neither species-habitat relationships nor subtle temporal shifts in distribution are discernible from maps at such a coarse resolution. This limitation exists largely because gathering occurrence data by sampling each species’ entire habitat range can be time-consuming and costly [14]. Presence-only models can estimate the realized niche only when the assumptions of known prevalence and sampling bias are valid [19], and they usually yield less accurate species-habitat associations and species distributions than presence-absence models [14, 20].

