Summary

1. Benthic macroinvertebrates (MI) are commonly used to assess freshwater ecosystems with the reference condition approach. Such assessments require control for natural community variation, either by categorical typologies or by predictive models. Such models have been widely and successfully developed for running water biota but not previously for lake profundal invertebrates.

2. We evaluated four modelling techniques [multivariate regression tree (MRT), limiting environmental differences, nonparametric multiplicative regression (NPMR) and River Invertebrate Prediction And Classification System (RIVPACS)] and the operative Finnish lake typology for assessing taxonomic completeness (observed-to-expected number of taxa, O/E) of profundal MI assemblages. We used data from 74 and 33 minimally disturbed reference lake basins for calibration and validation of the approaches, respectively, and 72 test basins subject to various anthropogenic pressures to evaluate their sensitivity in detecting impact. Either all predicted taxa (threshold probability of capture Pt = 0+) or only those predicted to be captured with ≥0.25 probability were used to calculate O/E.

3. With Pt = 0.25, all four modelling approaches were accurate (mean O/E = 0.966–1.053) but imprecise (SD of O/E = 0.279–0.304) in predicting the fauna actually observed in validation sites. All models were slightly more precise than a null model (mean 1.038, SD 0.343) or the typology (mean 1.046, SD 0.327). The taxon-specific NPMR model was slightly more precise than the other three models, which are based on site groupings.

4. The O/E values correlated relatively weakly (r = 0.55–0.86) among the approaches, which thus produced contrasting lake-specific assessments despite their seemingly comparable performances. Indeed, the typology, which suggested that MI assemblages were impaired in 56% of test sites, was more sensitive than the other approaches (26–46%) as an indicator of human-induced deterioration. However, this greater ostensible sensitivity seemed to be biased, as lake morphometry, a main driver of natural community variation, remained uncontrolled by the typology.

5. Generally, our exercise illustrates the inconclusiveness of the common validation criteria for the assessment methods. The apparent poor predictability of the profundal fauna, irrespective of the method, may partly stem from large observation error, which could be alleviated by more intensive sampling. However, instead of an O/E-taxa index, some other metric encompassing quantitative aspects might be preferable for assessing these species-poor communities.
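To make the O/E index in point 2 concrete, the following minimal sketch shows how it is conventionally computed in RIVPACS-type assessments: E is the sum of the predicted capture probabilities of all taxa at or above the threshold Pt, and O is the number of those same taxa actually observed. This is an illustration under stated assumptions, not the authors' implementation. The function name `observed_to_expected`, the capture probabilities and the observed sample are all hypothetical.

```python
def observed_to_expected(capture_probs, observed_taxa, pt=0.25):
    """Return (O, E, O/E) for one site.

    capture_probs: dict mapping taxon name -> predicted probability of capture
    observed_taxa: set of taxon names found in the sample
    pt: threshold probability of capture; pt=0.0 corresponds to Pt = 0+,
        i.e. every taxon with a non-zero predicted probability is used
    """
    # Keep only taxa whose predicted probability is non-zero and meets Pt.
    expected = {t: p for t, p in capture_probs.items() if p > 0 and p >= pt}
    e = sum(expected.values())                        # expected number of taxa
    o = sum(1 for t in expected if t in observed_taxa)  # observed among expected
    if e == 0:
        raise ValueError("no taxa meet the threshold; O/E is undefined")
    return o, e, o / e


# Hypothetical predictions and sample for a single lake basin (made-up values).
probs = {
    "Chironomus": 0.8,
    "Procladius": 0.6,
    "Chaoborus": 0.5,
    "Pisidium": 0.2,
    "Sergentia": 0.1,
}
sample = {"Chironomus", "Procladius", "Pisidium"}

o25, e25, oe25 = observed_to_expected(probs, sample, pt=0.25)
o0, e0, oe0 = observed_to_expected(probs, sample, pt=0.0)
print(f"Pt = 0.25: O = {o25}, E = {e25:.2f}, O/E = {oe25:.3f}")
print(f"Pt = 0+:   O = {o0}, E = {e0:.2f}, O/E = {oe0:.3f}")
```

In this made-up example, raising Pt from 0+ to 0.25 drops the two rare taxa from both O and E, so the same sample yields a different O/E value under each threshold. With so few taxa expected per site, a single missed or added taxon shifts the ratio considerably, which is consistent with the imprecision the abstract reports for these species-poor communities.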