Improving plant DNA metabarcoding accuracy with ecological filters and Angiosperms353: Field and pollen microscopy validation

Abstract

Premise: Metabarcoding has become a successful tool for the identification of species in ecological assemblages. However, the usefulness of metabarcoding for identifying plant species has been hampered due to a lack of universal gene regions that work across all taxa, limiting the applications of metabarcoding in ecology.

Methods: Here, we outline a spatiotemporal approach that combines Angiosperms353 baits with species distribution models and phenological analyses to generate a list of candidate species to increase metabarcoding accuracy. To evaluate the ecological realism of our framework, we compared the results of DNA metabarcoding pollen loads of wild bumble bees to long‐term field observations of bee–plant interactions and visual pollen identification.

Results: We show that metabarcoding bumble bee pollen loads was most accurate when combined with a candidate taxa list of plants flowering when the bumble bees were foraging, which improved the accuracy and taxonomic precision of 77.5% of samples relative to non‐filtered matches.

Discussion: With the proliferation of species occurrence and phenology data and advances in computing and software, spatiotemporal filtering provides an improved approach for interpreting metabarcoding results. Additionally, we demonstrate that Angiosperms353 offers significant promise for metabarcoding projects to reveal species interactions.
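The spatiotemporal filtering idea in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Match` record, the function names, the score scale, and all toy values are hypothetical, and the species distribution model and phenology outputs are assumed to have already been reduced to a set of taxa predicted at the site and a table of flowering months.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    sample_id: str   # pollen load the sequence came from
    taxon: str       # taxon named by the reference-database match
    score: float     # match score; higher is better (scale is hypothetical)


def candidate_taxa(predicted_present, flowering_months, month):
    """Taxa predicted at the site AND recorded as flowering in the given month."""
    return {t for t in predicted_present if month in flowering_months.get(t, ())}


def best_candidate_match(matches, candidates):
    """Per sample, keep the highest-scoring match whose taxon is a candidate."""
    best = {}
    for m in matches:
        if m.taxon not in candidates:
            continue  # implausible in space or time: discard
        kept = best.get(m.sample_id)
        if kept is None or m.score > kept.score:
            best[m.sample_id] = m
    return best


# Toy inputs with placeholder names; none of these values come from the study.
predicted_present = {"taxon_A", "taxon_B"}
flowering_months = {"taxon_A": {6, 7}, "taxon_B": {9}, "taxon_C": {6}}
matches = [
    Match("s1", "taxon_C", 0.99),
    Match("s1", "taxon_A", 0.95),
    Match("s2", "taxon_B", 0.90),
]

candidates = candidate_taxa(predicted_present, flowering_months, month=6)
filtered = best_candidate_match(matches, candidates)
```

In this toy run, sample `s1` resolves to `taxon_A` even though `taxon_C` scored higher, because `taxon_C` is not predicted at the site. Sample `s2` is left unassigned because its only match is not flowering in the sampling month.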

Similar Papers
  • Research Article
  • 10.1002/ece3.72733
Pollen Foraging by Bumble Bee Queens During a Critical Nesting Period Revealed by DNA Metabarcoding
  • Dec 1, 2025
  • Ecology and Evolution
  • Kelsey Schoenemann + 3 more

The nest‐founding stage represents an especially vulnerable period of the bumble bee (Bombus) life cycle, during which solitary queens must locate and collect sufficient foraging resources to sustain themselves and their brood. Yet, we lack contemporary information about floral foraging resources used by queens in early spring. Here, we use next‐generation sequencing to characterize the floral species used by queens for pollen provisions during early nest establishment. We collected pollen loads from over 100 wild bumble bee queens at working farms, rural and city parks, and nature preserves across the Piedmont region of Virginia, USA. Using metabarcoding of two universal DNA barcodes for plants, ITS2 and rbcL, we determined the taxonomic composition of pollen used by queens. Pollen loads contained native and non‐native woody (e.g., Cercis: Fabaceae, Prunus: Rosaceae, Salix: Salicaceae), herbaceous (e.g., Lamium: Lamiaceae, Viola: Violaceae), and vine (e.g., Lonicera: Caprifoliaceae) taxa. The non‐native Lamium and Elaeagnus (Elaeagnaceae) most frequently hosted foraging queens, owing in part to their abundance across sites and the season. Pollen composition varied more over time than among bumble bee species or across sites, but land cover predicted a small amount of variation in pollen composition. Specifically, the percentage of crop land within 1 km increased the representation of Lamium in queen pollen loads, likely reflecting the abundance of the disturbance‐adapted flower in fallow cornfields. Finally, the pollen communities detected by rbcL were twice as diverse as those by ITS2, perhaps owing to the better taxonomic resolution afforded by the fast‐evolving rbcL marker. This study demonstrates that queens are flexible foragers and that among the most common Bombus species, plant phenology drives pollen use more than species identity. Further, this study highlights the importance of monitoring pollen diets to inform regional management strategies and considerations about metabarcoding techniques.

  • Research Article
  • Citations: 3
  • 10.1007/s13592-021-00867-5
The effects of commercial propagation on bumble bee (Bombus impatiens) foraging and worker body size
  • Aug 3, 2021
  • Apidologie
  • Genevieve Pugesek + 2 more

Bumble bees (Bombus spp.) have been commercially propagated for over three decades. As the environmental conditions experienced by commercial bumble bees differ greatly from those experienced by wild bumble bees, commercial rearing of bumble bees may cause phenotypic changes. Here, we compare the foraging behavior and size of worker bumble bees (Bombus impatiens) from commercial and wild colonies. For this experiment, we measured worker body size, recorded if the workers returned with pollen, and examined the contents of pollen loads via microscopy. We found that, while commercial and wild bumble bees foraged on similar communities of flowers, wild bumble bees returned to colonies with purer pollen baskets (higher proportion of the most common species) and were more likely to return to the colony with pollen than their commercial counterparts. Commercial bumble bees were also smaller than wild bees. Our work highlights differences between commercial and wild bumble bees, in addition to raising important unanswered questions about the mechanism and drivers of these differences.

  • Research Article
  • Citations: 120
  • 10.1111/jbi.13608
Testing species assemblage predictions from stacked and joint species distribution models
  • Jun 5, 2019
  • Journal of Biogeography
  • Damaris Zurell + 5 more

Aim: Predicting the spatial distribution of species assemblages remains an important challenge in biogeography. Recently, it has been proposed to extend correlative species distribution models (SDMs) by taking into account (a) covariance between species occurrences in so‐called joint species distribution models (JSDMs) and (b) ecological assembly rules within the SESAM (spatially explicit species assemblage modelling) framework. Yet, little guidance exists on how these approaches could be combined. We, thus, aim to compare the accuracy of assemblage predictions derived from stacked and from joint SDMs.

Location: Switzerland.

Taxon: Birds, tree species.

Methods: Based on two monitoring schemes (national forest inventory and Swiss breeding bird atlas), we built SDMs and JSDMs for tree species (at 100 m resolution) and forest birds (at 1 km resolution). We tested accuracy of species assemblage and richness predictions on holdout data using different stacking procedures and ecological assembly rules.

Results: Despite minor differences, results were consistent between birds and tree species. Cross‐validated species‐level model performance was generally higher in SDMs than JSDMs. Differences in species richness and assemblage predictions were larger between stacking procedures and ecological assembly rules than between stacked SDMs and JSDMs. On average, predictions were slightly better for stacked SDMs compared to JSDMs, probabilistic stacks outperformed binary stacks, and ecological assembly rules yielded best predictions.

Main conclusions: When predicting the composition of species assemblages, the choice of stacking procedure and ecological assembly rule seems more decisive than differences in underlying model type (SDM vs. JSDM). JSDMs do not seem to improve community predictions compared to SDMs or improve predictions for rare species. Still, JSDMs may provide additional insights into community assembly and may help deriving hypotheses about prevailing biotic interactions in the system. We provide simple rules of thumb for choosing appropriate modelling pathways. Future studies should test these preliminary guidelines for other taxa and biogeographical realms as well as for other JSDM algorithms.
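The contrast between probabilistic and binary stacks mentioned in this abstract can be sketched as follows. This is a generic illustration of the two stacking ideas, not the procedure used in the study: the function names, the 0.5 threshold, and the toy probabilities are assumptions. A probabilistic stack sums per-species occurrence probabilities to get expected richness at a site, whereas a binary stack first thresholds each probability to presence or absence and then counts presences.

```python
def probabilistic_richness(probabilities):
    """Expected species richness: the sum of per-species occurrence probabilities."""
    return sum(probabilities)


def binary_richness(probabilities, threshold=0.5):
    """Richness after thresholding each species to presence (1) or absence (0)."""
    return sum(1 for p in probabilities if p >= threshold)


# Toy site with five species, each predicted at probability 0.4 (made-up values).
site_probabilities = [0.4, 0.4, 0.4, 0.4, 0.4]

expected = probabilistic_richness(site_probabilities)
thresholded = binary_richness(site_probabilities)
```

With these toy values the probabilistic stack expects about two species at the site, while the binary stack predicts none, because every species falls below the assumed threshold. The example shows how the choice of stacking procedure alone can change a richness prediction.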

  • Research Article
  • Citations: 34
  • 10.1111/mec.16112
Capabilities and limitations of using DNA metabarcoding to study plant-pollinator interactions.
  • Aug 29, 2021
  • Molecular Ecology
  • Katherine A Arstingstall + 6 more

Many pollinator populations are experiencing declines, emphasizing the need for a better understanding of the complex relationship between bees and flowering plants. Using DNA metabarcoding to describe plant-pollinator interactions eliminates many challenges associated with traditional methods and has the potential to reveal a more comprehensive understanding of foraging behaviour and pollinator life history. Here we use DNA metabarcoding of ITS2 and rbcL gene regions to identify plant species present in pollen loads of 404 bees from three habitats in eastern Oregon. Our specific objectives were to (i) determine whether plant species identified using DNA metabarcoding are consistent with plant species identified using observations, (ii) compare characterizations of diet breadth derived from foraging observations to those based on plant species assignments obtained using DNA metabarcoding, and (iii) compare plant species assignments produced by DNA metabarcoding using a "regional" reference database to those produced using a "local" database. At the three locations, 31%-86% of foraging observations were consistent with DNA metabarcoding data, 8%-50% of diet breadth characterizations based on observations differed from those based on DNA metabarcoding data, and 22%-25% of plant species detected using the regional database were not known to occur in the study area in question. Plant-pollinator networks produced from DNA metabarcoding data had higher sampling completeness and significantly lower specialization than networks based on observations. Here, we examine some strengths and limitations of using DNA metabarcoding to identify plant species present in bee pollen loads, make ecological inferences about foraging behaviour and provide guidance for future research.

  • Research Article
  • Citations: 19
  • 10.1016/j.scitotenv.2023.166214
Honey bees and bumble bees may be exposed to pesticides differently when foraging on agricultural areas
  • Aug 9, 2023
  • Science of The Total Environment
  • Elena Zioga + 2 more

In an agricultural environment, where crops are treated with pesticides, bees are likely to be exposed to a range of chemical compounds in a variety of ways. The extent to which different bee species are affected by these chemicals largely depends on the concentrations and type of exposure. We quantified the presence of selected pesticide compounds in the pollen of two different entomophilous crops: oilseed rape (Brassica napus) and broad bean (Vicia faba). Sampling was performed in 12 sites in Ireland and our results were compared with the pollen loads of honey bees and bumble bees actively foraging on those crops in those same sites. Detections were compound specific, and the timing of pesticide application in relation to sampling likely influenced the final residue contamination levels. Most detections originated from compounds that were not recently applied on the fields, and samples from B. napus fields were more contaminated compared to those from V. faba fields. Crop pollen was contaminated only with fungicides, honey bee pollen loads contained mainly fungicides, while more insecticides were detected in bumble bee pollen loads. The highest number of compounds and most detections were observed in bumble bee pollen loads, where notably, all five neonicotinoids assessed (acetamiprid, clothianidin, imidacloprid, thiacloprid, and thiamethoxam) were detected despite no recent application of these compounds on the fields where samples were collected. The concentrations of neonicotinoid insecticides were positively correlated with the number of wild plant species present in the bumble bee-collected pollen samples, but this relationship could not be verified for honey bees. The compounds azoxystrobin, boscalid and thiamethoxam formed the most common pesticide combination in pollen. 
Our results raise concerns about potential long-term bee exposure to multiple residues and question whether honey bees are suitable surrogates for pesticide risk assessments for all bee species.

  • Research Article
  • 10.1111/1365-2664.70163
Uncertainty in blacklisting potential Pacific plant invaders using species distribution models
  • Sep 13, 2025
  • Journal of Applied Ecology
  • Valén Holle + 8 more

Invasive alien species pose a growing threat to global biodiversity, underscoring the need for evidence‐based prevention strategies. Species distribution models (SDMs) are a widely used tool to estimate the potential distribution of alien species and to inform blacklists based on establishment risk. Yet, data limitations and modelling decisions can introduce uncertainty in these predictions. Here, we aim to quantify the contribution of four key sources of uncertainty in SDM‐based blacklists: species occurrence data, environmental predictors, SDM algorithms and thresholding methods for binarising predictions. Focusing on 82 of the most invasive plant species on the Hawaiian Islands, we built SDMs to quantify their establishment potential in the Pacific region. To assess uncertainty, we systematically varied four modelling components: species occurrence data (native vs. global), environmental predictors (climatic vs. edapho‐climatic), four SDM algorithms and three thresholding methods. From these models, we derived blacklists using three alternative blacklisting definitions and quantified the variance in establishment risk scores and resulting species rankings attributable to each source of uncertainty. SDMs showed fair predictive performance overall. Among the sources of uncertainty, the thresholding method had the strongest and most consistent influence on risk scores across all three blacklist definitions but resulted in only minor changes in blacklist rankings. Algorithm choice had the most pronounced effect on blacklist rankings, followed by smaller but important effects of species occurrence data and environmental predictors. Notably, models based only on native occurrences often underestimated establishment potential. Synthesis and applications. SDMs can provide valuable support for planning the preventive management of alien species. However, our findings show that blacklist outcomes are highly sensitive to modelling decisions. 
While ensemble modelling across multiple algorithms is a recommended best practice, our results reinforce the importance of incorporating global occurrence data when available and carefully evaluating the trade‐offs of including additional environmental predictors. Given the strong influence of thresholding on risk scores, we emphasise the need for transparent, context‐specific threshold selection. More broadly, explicitly assessing uncertainty in SDM outputs can improve the robustness of blacklists and support scientifically informed, precautionary decision‐making, particularly in data‐limited situations where pragmatic modelling choices must be taken.

  • Research Article
  • Citations: 37
  • 10.1371/journal.pone.0050353
The Abundance and Pollen Foraging Behaviour of Bumble Bees in Relation to Population Size of Whortleberry (Vaccinium uliginosum)
  • Nov 27, 2012
  • PLoS ONE
  • Carolin Mayer + 4 more

Habitat fragmentation can have severe effects on plant–pollinator interactions, for example changing the foraging behaviour of pollinators. To date, the impact of plant population size on pollen collection by pollinators has not yet been investigated. From 2008 to 2010, we monitored nine bumble bee species (Bombus campestris, Bombus hortorum s.l., Bombus hypnorum, Bombus lapidarius, Bombus pascuorum, Bombus pratorum, Bombus soroensis, Bombus terrestris s.l., Bombus vestalis s.l.) on Vaccinium uliginosum (Ericaceae) in up to nine populations in Belgium ranging in size from 80 m² to over 3.1 ha. Bumble bee abundance declined with decreasing plant population size, and especially the proportion of individuals of large bumble bee species diminished in smaller populations. The most remarkable and novel observation was that bumble bees seemed to switch foraging behaviour according to population size: while they collected both pollen and nectar in large populations, they largely neglected pollen collection in small populations. This pattern was due to large bumble bee species, which seem thus to be more likely to suffer from pollen shortages in smaller habitat fragments. Comparing pollen loads of bumble bees we found that fidelity to V. uliginosum pollen did not depend on plant population size but rather on the extent of shrub cover and/or openness of the site. Bumble bees collected pollen only from three plant species (V. uliginosum, Sorbus aucuparia and Cytisus scoparius). We also did not discover any pollination limitation of V. uliginosum in small populations. We conclude that habitat fragmentation might not immediately threaten the pollination of V. uliginosum; nevertheless, it provides important nectar and pollen resources for bumble bees and declining populations of this plant could have negative effects for its pollinators. 
The finding that large bumble bee species abandon pollen collection when plant populations become small is of interest when considering plant and bumble bee conservation.

  • Research Article
  • Citations: 5
  • 10.3390/insects12100922
Bumble Bee Foraged Pollen Analyses in Spring Time in Southern Estonia Shows Abundant Food Sources.
  • Oct 9, 2021
  • Insects
  • Anna Bontšutšnaja + 3 more

Simple Summary: Pollinators make a strong contribution to ecosystem stability. However, nowadays, they also need protection and sustainable habitat to live and develop. Not all regions can provide suitable habitats due to agricultural intensification, urbanization, climate changes and corresponding impacts. Our study was conducted in the late spring in south Estonia where arable lands were surrounded by forest patches and rural areas. For better performance, we used both light microscopy and DNA metabarcoding methods for pollen identification. We found that bumble bees foraged on the diverse food sources showing preferences for several main plant families. Additionally, in our case, land-use types did not show important effects on bumble bee food choices and foraging decisions. Various landscape features can provide diverse food sources at the early development stages and support nest longevity. Here, we can say that a better understanding of pollinators’ food preferences can help in the application of more suitable measures for their conservation.

Agricultural landscapes usually provide higher quantities of single-source food, which are noticeably lacking in diversity and might thus have low nutrient value for bumble bee colony development. Here, in this study, we analysed the pollen foraging preferences over a large territory of a heterogeneous agricultural landscape: southern Estonia. We aimed to assess the botanical diversity of bumble bee food plants in the spring time there. We looked for preferences for some food plants or signs of food shortage that could be associated with any particular landscape features. For this purpose, we took Bombus terrestris commercial hives to the landscape, performed microscopy analyses and improved the results with the innovative DNA metabarcoding technique to determine the botanical origin of bumble bee-collected pollen. We found high variability of forage plants with no strong relationship with any particular landscape features. Based on the low number of plant species in single flights, we deduce that the availability of main forage plants is sufficient indicating rich forage availabilities. Despite specific limitations, we saw strong correlations between microscopy and DNA metabarcoding data usable for quantification analyses. As a conclusion, we saw that the spring-time vegetation in southern Estonia can support bumble bee colony development regardless of the detailed landscape structure. The absence of clearly dominating food preference by the tested generalist bumble bee species B. terrestris makes us suggest that other bumble bee species, at least food generalists, should also find plenty of forage in their early development phase.

  • Research Article
  • 10.1093/jisesa/ieaf095
Bacterial communities of wild bee species and the western honey bee (Apis mellifera) (Hymenoptera: Apoidea): Alpine insights
  • Nov 7, 2025
  • Journal of Insect Science
  • Fabian P Royer + 5 more

Wild bees are decreasing in species diversity and populations due to human impact. The abundance of the western honey bee (Apis mellifera L.) experiences an inverse trend, enhancing competition with wild bees and the probability of microbiome exchange. Addressing this exchange, we studied the gut microbiome composition of wild and honey bees, focusing on patterns indicating honey bee influence. Three solitary wild bee species (large scabious mining bee [Andrena hattorfiana F.], grey-backed mining bee [Andrena vaga Panzer], and European orchard bee [Osmia cornuta Latreille]) as well as bumble bees as representatives of eusocial wild bees (Bombus spp. Latreille) and honey bees were sampled in the Austrian Alps. Subsequent 16S ribosomal DNA sequencing revealed the composition of the bacterial communities. The bee groups differed concerning their bacterial composition, with honey bees having the least variation among individuals and a low number of exclusive bacterial taxa and bumble bees the highest bacterial diversity. High honey bee densities corresponded with lower bacterial diversity in wild bees and a higher bacterial similarity between wild and honey bees. Some bacterial taxa were found for the first time in the studied bee groups. Furthermore, the composition of bacterial communities differed between solitary and social bees. We found the first hints that high honey bee density negatively impacts wild bees through alterations of wild bee microbiomes. Future studies should focus on understanding microbiome transmission mechanisms and their consequences for wild bees. Suggestions on how to consider wild bee fitness are indispensable in halting the biodiversity crisis.

  • Research Article
  • Citations: 17
  • 10.3390/agronomy10091413
Analysis of Pollination Services Provided by Wild and Managed Bees (Apoidea) in Wild Blueberry (Vaccinium angustifolium Aiton) Production in Maine, USA, with a Literature Review
  • Sep 17, 2020
  • Agronomy
  • Sara L Bushmann + 1 more

Maine is the largest producer of wild blueberry (Vaccinium angustifolium Aiton) in the United States. Pollination comes from combinations of honey bees (Apis mellifera (L.)), commercial bumble bees (Bombus impatiens Cresson), and wild bees. This study addresses (1) previous research addressing wild-blueberry pollination, (2) effects of wild-bee and honey-bee activity densities on fruit set, yield, and crop value, (3) the economic value of wild-bee communities, and (4) economic consequences of pollinator loss. Bee communities were sampled in 40 fields over three years (2010–2012) and bee activity densities were estimated for bumble bees, honey bees, and other wild bees. These data were applied to an economic model to estimate the value of bee taxa. Bumble bees and honey bees predicted fruit set and reduced its spatial heterogeneity. Other wild bees were not significant predictors of fruit set. Yield was predicted by fruit set and field size, but not pest management tactics. Our analysis showed that disruption in supply of honey bees would result in nearly a 30% decrease in crop yield, buffered in part by wild bees that provide “background” levels of pollination. Honey-bee stocking density and, thus, the activity density of honey bees was greater in larger fields, but not for wild bees. Therefore, a decrease in crop yield would be greater than 30% for large fields due to the proportionally greater investment in honey bees in large fields and a relatively lower contribution by wild bees.

  • Research Article
  • Citations: 61
  • 10.1016/j.agee.2019.106792
Foraging of honey bees in agricultural landscapes with changing patterns of flower resources
  • Dec 19, 2019
  • Agriculture, Ecosystems & Environment
  • Svenja Bänsch + 4 more


  • Research Article
  • Citations: 7
  • 10.1093/aesa/say046
A Preliminary Assessment of Bumble Bee (Hymenoptera: Apidae) Habitat Suitability Across Protected and Unprotected Areas in the Philippines
  • Nov 29, 2018
  • Annals of the Entomological Society of America
  • Jonathan B Koch + 1 more

The Philippines is a biodiversity hotspot and is home to thousands of endemic species, including at least two understudied bumble bee species: Bombus flavescens Smith, 1852 and Bombus irisanensis Cockerell, 1910. Since the 1990s, there have been virtually no studies published on the biology, ecology, and taxonomy of Philippine bumble bees, evidence of the dearth of basic entomological investigations on these important insects. In this preliminary study, our objective is to briefly summarize the geographic distribution of bumble bee habitat suitability (HS) in the Philippines across protected and unprotected areas. Maximum entropy species distribution models (SDMs) of B. flavescens and B. irisanensis were constructed using 19 unique occurrence records and 11 bioclimatic variables to estimate HS in the Philippines. Our SDMs estimated that minimum HS for B. flavescens and B. irisanensis covers ~28,066 and ~24,603 km² of the 114 protected land parcels in the Philippines, respectively. Across unprotected areas, our SDMs estimated that minimum HS for B. flavescens and B. irisanensis covers ~146,063 and ~156,674 km², respectively. As predicted, high-elevation habitats have the highest HS relative to low-elevation habitats (r = 0.61, P = 0.003). While our SDMs predict an extensive distribution of both species across both protected and unprotected areas, it is important to note that nearly 80% of the Philippines is deforested. Our study identifies high-elevation protected areas as places where bumble bees may still thrive, and survey effort should be prioritized to these places to determine the status of Philippine bumble bees.

  • Research Article
  • 10.1128/aem.02036-24
Bumble bee gut microbial community structure differs between species and commercial suppliers, but metabolic potential remains largely consistent
  • Mar 19, 2025
  • Applied and Environmental Microbiology
  • Michelle Z Hotchkiss + 2 more

Bumble bees are key pollinators for natural and agricultural plant communities. Their health and performance are supported by a core gut microbiota composed of a few bacterial taxa. However, the taxonomic composition and community structure of bumble bee gut microbiotas can vary with bee species, environment, and origin (i.e., whether colonies come from the wild or a commercial rearing facility), and it is unclear whether metabolic capabilities therefore vary as well. Here we used metagenomic sequencing to examine gut microbiota community composition, structure, and metabolic potential across bumble bees from two different commercial Bombus impatiens suppliers, wild B. impatiens, and three other wild bumble bee species sampled from sites within the native range of all four species. We found that the community structure of gut microbiotas varied between bumble bee species, between populations from different origins within species, and between commercial suppliers. Notably, we found that Apibacter is consistently present in some wild bumble bee species, suggesting it may be a previously unrecognized core phylotype of bumble bees, and that commercial B. impatiens colonies can lack core phylotypes consistently found in wild populations. However, despite variation in community structure, the high-level metabolic potential of gut microbiotas was largely consistent across all hosts, including for metabolic capabilities related to host performance, though metabolic activity remains to be investigated.

IMPORTANCE: Our study is the first to compare genome-level taxonomic structure and metabolic potential of whole bumble bee gut microbiotas between commercial suppliers and between commercial and wild populations. In addition, we profiled the full gut microbiotas of three wild bumble bee species for the first time. 
Overall, our results provide new insight into bumble bee gut microbiota community structure and function and will help researchers evaluate how well studies conducted in one bumble bee population will translate to other populations and species. Research on taxonomic and metabolic variation in bumble bee gut microbiotas across species and origins is of increasing relevance as we continue to discover new ways that social bee gut microbiotas influence host health, and as some bumble bee species decline in range and abundance.

  • Research Article
  • 10.3897/biss.9.183023
Cubification of Biodiversity Data: FAIRiCUBE and the European Habitat Classification System
  • Dec 23, 2025
  • Biodiversity Information Science and Standards
  • Susanna Ioni + 4 more

European habitats are classified under a framework developed by the European Topic Centre for Biodiversity for the European Environment Agency, as part of the European Nature Information System (EUNIS) (Davies et al. 2004). All terrestrial, freshwater, and marine habitats follow a hierarchical classification based on physical features, human influence, and dominant vegetation (Moss 2008, Chytrý et al. 2020). Distribution maps are provided and modelled using occurrence data of indicator species collected from vegetation surveys (Hennekens 2017). Although the system may seem accurate, when we first plotted the distribution of the main species of our habitat study case, EUNIS Habitat S22 ‘Alpine and subalpine ericoid heath’ (European Environment Agency 2019), we observed that occurrence data, e.g., from sources like the Global Biodiversity Information Facility (GBIF), often fell outside the mapped areas of the habitat. Furthermore, important occurrence data sources, such as herbaria, were left out of the official distribution mapping, representing, in our view, a significant shortcoming of the EUNIS system. This study addresses these gaps by integrating diverse sources of in situ occurrence data (herbaria, vegetation surveys, citizen science) through a machine learning approach to complement the current EUNIS mapping. Specifically, we modelled the distributions of diagnostic species of the Habitat S22, using species distribution models (SDMs). For this purpose, we retrieved occurrence data from GBIF, identified by the accepted names as well as taxonomic synonyms, using the R package rgbif (Chamberlain et al. 2025), and utilised the Darwin Core (Wieczorek et al. 2012) standard. Data were filtered to include European occurrences with spatial coordinates and uncertainty of <500 m, and only spring and summer months of 1980–2024. For modelling itself, they were stratified into a 1-km grid. As SDM predictors, we used proxies for macroclimate and topography. 
Climatic predictors included CHELSA Bioclim variables of mean annual temperature, temperature seasonality, annual precipitation, precipitation seasonality, and an aridity index (Zomer et al. 2022). For topography, we used the digital terrain model, Copernicus, and calculated slope and indices for heat load (McCune and Keon 2002), topographical ruggedness (Riley et al. 1999), and topographical wetness (Beven and Kirkby 1979), using the spatialEco R package (Evans and Murphy 2021) and SAGA GIS (Conrad et al. 2015). Data were integrated into data cubes, and correlations among species occurrences and predictors were tested. We supplemented the occurrence data with pseudo-absences sampled within a buffer around presence points (Fallgatter et al. 2025). We fitted ensemble SDMs weighted by true-skill statistics scores based on independent cross-validation. We modelled two spatial resolutions in two regions: continental Europe at 1-km resolution, and the European Alps at 100-m resolution. Predicted species distributions were aggregated into cumulative distribution maps. Those were further validated by overlapping them with the distribution of the habitat based on vegetation plots classified by an expert system as provided by the European Vegetation Archive (EVA) plots at 1-km resolution. Predictions were also compared with the official EUNIS probability map for Habitat S22. Correlation analyses confirmed the ecological features of the Habitat S22 indicated by the EUNIS classification. Our modelled ranges largely overlapped with the distribution of EVA plots and the EUNIS probability map, but also revealed mismatches at lower elevations and in the Scandinavian region. These differences decreased when fewer species were combined in cumulative predictions. 
Our findings show that SDMs based on occurrence data from different sources can validate and refine expert-defined habitat maps, offering a complementary and data-driven approach.

  • Research Article
  • Citations: 24
  • 10.1111/ecog.07294
Optimising occurrence data in species distribution models: sample size, positional uncertainty, and sampling bias matter
  • Aug 2, 2024
  • Ecography
  • Vítězslav Moudrý + 27 more

Species distribution models (SDMs) have proven valuable in filling gaps in our knowledge of species occurrences. However, despite their broad applicability, SDMs exhibit critical shortcomings due to limitations in species occurrence data. These limitations include, in particular, issues related to sample size, positional uncertainty, and sampling bias. In addition, it is widely recognised that the quality of SDMs as well as the approaches used to mitigate the impact of the aforementioned data limitations depend on species ecology. While numerous studies have evaluated the effects of these data limitations on SDM performance, a synthesis of their results is lacking. However, without a comprehensive understanding of their individual and combined effects, our ability to predict the influence of these issues on the quality of modelled species–environment associations remains largely uncertain, limiting the value of model outputs. In this paper, we review studies that have evaluated the effects of sample size, positional uncertainty, sampling bias, and species ecology on SDM outputs. We build upon their findings to provide recommendations for the critical assessment of species data intended for use in SDMs.
