Abstract

Ecologists often develop models that describe the relationship between faunal communities and their habitat. Coral reef fishes have been the focus of numerous such studies, which have used a wide range of statistical tools to answer an equally wide range of questions. Here, we apply a series of both conventional statistical techniques (linear and generalized additive regression models) and novel machine-learning techniques (the support vector machine and three ensemble techniques used with regression trees) to predict fish species richness, biomass, and diversity from a range of habitat variables. We compare the techniques in terms of their predictive performance, and we compare a subset of the models in terms of the influence each habitat variable has on the predictions. Prediction errors are estimated by cross-validation, and variable importance is assessed by permuting the values of individual variables. For predictions of species richness and diversity, the tree-based models in general, and the random forest model in particular, are superior (they produce the lowest errors). These model types are all able to model both nonlinear and interaction effects. The linear model, which can model neither effect type, performs the worst (it produces the highest errors). For predictions of biomass, the generalized additive model is superior, and the support vector machine performs the worst. Depth range, the difference between maximum and minimum water depth at a given site, is identified as the most important variable in the majority of models predicting the three fish community variables. However, variable importance is highly dependent upon model type, which raises questions about the interpretation of variable importance and its proper use as an indicator of causality. The representation of ecological relationships by tree-based ensemble learners will improve predictive performance and provide a new avenue for exploring ecological relationships, both statistical and causal.
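
The two evaluation steps named above, cross-validated prediction error and permutation-based variable importance, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the variable names `depth_range` and `irrelevant` are hypothetical, and a simple k-nearest-neighbour regressor stands in for the models actually compared in the study. Shuffling a variable within the held-out fold is one common way to compute permutation importance; the abstract does not say which variant was used.

```python
import math
import random

def knn_predict(train_X, train_y, x, k=5):
    """Predict y for one row as the mean response of its k nearest training rows."""
    nearest = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def cv_rmse(X, y, n_folds=5, permute_col=None, seed=0):
    """K-fold cross-validated RMSE.

    If permute_col is given, that column is shuffled within each held-out
    fold before prediction, which breaks its link to the response.
    """
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)  # same seed gives the same folds for every call
    folds = [idx[i::n_folds] for i in range(n_folds)]
    y_true, y_pred = [], []
    for fold in folds:
        held = set(fold)
        train_X = [X[i] for i in idx if i not in held]
        train_y = [y[i] for i in idx if i not in held]
        test_X = [list(X[i]) for i in fold]  # copies, so X is never modified
        if permute_col is not None:
            col = [row[permute_col] for row in test_X]
            rng.shuffle(col)
            for row, value in zip(test_X, col):
                row[permute_col] = value
        for row, i in zip(test_X, fold):
            y_pred.append(knn_predict(train_X, train_y, row))
            y_true.append(y[i])
    return rmse(y_true, y_pred)

def make_data(n=200, seed=1):
    """Synthetic sites (assumption): richness depends on depth_range only."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        depth_range = rng.uniform(0, 10)
        irrelevant = rng.uniform(0, 10)
        X.append([depth_range, irrelevant])
        y.append(20 + 3 * depth_range + rng.gauss(0, 1))
    return X, y

names = ["depth_range", "irrelevant"]
X, y = make_data()
baseline = cv_rmse(X, y)
# Importance = increase in cross-validated error after permuting one variable.
importance = {name: cv_rmse(X, y, permute_col=j) - baseline
              for j, name in enumerate(names)}

print("baseline RMSE:", round(baseline, 3))
for name, value in importance.items():
    print(name, "importance:", round(value, 3))
```

In this sketch, a variable whose permutation barely changes the error is one the model does not rely on. Because the score is computed through a fitted model, a different model type can rank the same variables differently, which is the model dependence the abstract cautions about.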
