Abstract

The prediction of species distributions is a primary goal in the study, conservation, and management of fisheries resources. Statistical models relating patterns of species presence or absence to multiscale habitat variables play an important role in this regard. Researchers, however, have paid little attention to how improper model validation and chance predictions can result in unfounded confidence in the performance and utility of such models. Using simulated and empirical data for 40 lake and stream fish species, we demonstrate that the commonly employed resubstitution approach to model validation (in which the same data are used for both model construction and prediction) produces highly biased estimates of correct classification rates and consequently an inaccurate perception of true model performance. In contrast, a jackknife approach to validation produces relatively unbiased estimates of model performance. Estimated rates of correct classification are also shown to be substantially influenced by species prevalence (i.e., the proportion of sites at which a species is present), an effect that often causes poorly performing models to be viewed as powerful. We use simulated data to show how the expected frequency of chance predictions from models is a function of species prevalence and sample size. Finally, we use empirical data to illustrate a randomization approach for assessing whether the performance of fish habitat models is statistically greater than that expected from chance predictions alone. In summary, we urge researchers to employ proper and defensible methodologies for model validation and prediction assessment; failing to do so will only add to the accumulating number of published species habitat models in the fisheries literature that are of limited use and reliability.
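
To make these ideas concrete, the following Python sketch (not taken from the paper) simulates presence-absence data for a toy single-variable threshold classifier and contrasts resubstitution with jackknife (leave-one-out) estimates of the correct classification rate (CCR), computes the CCR expected by chance from species prevalence, and runs a simple label-randomization test. All function names and the simulation setup are hypothetical illustrations, assuming only NumPy is available.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate presence/absence at n sites driven by one habitat variable.
n = 100
habitat = rng.normal(size=n)
presence = (rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * habitat))).astype(int)

def fit_threshold(x, y):
    """Choose the cutoff on x that maximizes training CCR (a deliberately
    overfit-prone rule, which makes the resubstitution bias easy to see)."""
    cuts = np.sort(x)
    ccrs = [np.mean((x > c).astype(int) == y) for c in cuts]
    return cuts[int(np.argmax(ccrs))]

def ccr_resubstitution(x, y):
    """Resubstitution: the same sites are used to build and test the model."""
    c = fit_threshold(x, y)
    return float(np.mean((x > c).astype(int) == y))

def ccr_jackknife(x, y):
    """Jackknife (leave-one-out): each site is predicted by a model
    built without that site."""
    idx = np.arange(len(x))
    hits = sum(
        int((x[i] > fit_threshold(x[idx != i], y[idx != i])) == y[i])
        for i in idx
    )
    return hits / len(x)

p = presence.mean()             # species prevalence
chance_ccr = p**2 + (1 - p)**2  # expected CCR when presence is predicted
                                # at random in proportion to prevalence

observed = ccr_jackknife(habitat, presence)

# Randomization test: break the habitat-presence association by shuffling
# the presence records, then ask how often a "chance" model does as well.
null = [ccr_jackknife(habitat, rng.permutation(presence)) for _ in range(99)]
p_value = (1 + sum(c >= observed for c in null)) / (1 + len(null))

print(f"resubstitution CCR: {ccr_resubstitution(habitat, presence):.2f}")
print(f"jackknife CCR:      {observed:.2f}")
print(f"chance CCR at prevalence {p:.2f}: {chance_ccr:.2f}")
print(f"randomization p-value: {p_value:.3f}")
```

On data like these, the resubstitution CCR typically exceeds the jackknife CCR, illustrating the optimistic bias described above, while the chance CCR shows how a highly prevalent (or very rare) species inflates apparent accuracy: at a prevalence of 0.9, always predicting "present" is correct 90% of the time despite the model having no predictive value.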
