Maximum entropy (maxent) modelling is a widely used method for developing species distribution models (SDMs), but default maxent modelling methods can result in overly complex models with poor transferability. Methods suggested to reduce overfitting include increasing regularisation, using only linear and quadratic features, or applying forward selection of predictors using maximum likelihood (ML) methods. We built models using these options to determine environmental suitability within existing aquaculture zones for eight seaweed species, four red (Rhodophyta: Florideophyceae) and four brown (Ochrophyta: Phaeophyceae), that are being investigated for aquaculture in southern Australia. Forward selection models were the most parsimonious, but ML methods failed for Pterocladia lucida (Rhodophyta) due to separation. Separation is a known issue for logistic regression and has recently been recognised in maxent models. It occurs where a variable, or combination of variables, perfectly predicts a binary response (here, species occurrence), which drives ML parameter estimates towards infinity. One method for obtaining finite parameter estimates under separation is to apply a Cauchy prior distribution for coefficients. We therefore also built models for each species using a Cauchy-prior version of the forward selection method, and found that these models performed similarly to those built with ML methods. Default models achieved marginally higher predictive performance than other options based on training data metrics, but simpler models performed equivalently to, or better than, default models at predicting independent presence-absence test data. Predictive performance using test data varied considerably between species, but the difference in performance between models within each species was generally small. Our results confirm the concern that default maxent models may suffer from overfitting and poor transferability.
Model transferability and interpretability were important for our purpose, so, following the principle of parsimony, we preferred forward selection models. Forward selection models also retained predictive performance similar to that of the best model as assessed by each metric, further supporting their use. Where ML methods failed due to separation, the Cauchy-prior method was a viable alternative. Predictions for the region of interest (Spencer Gulf, South Australia) were generated using the most parsimonious models. Solieria robusta (Rhodophyta) showed the highest predicted suitability of the eight candidate species within existing aquaculture zones, especially in northern Spencer Gulf. Predicted suitability was low for the other Rhodophyta considered, while each of the Phaeophyceae showed moderate to high suitability in at least some southern Spencer Gulf aquaculture zones. These model results help to inform selection of the best candidate species and suitable farming areas for future research.
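The separation problem described above can be illustrated with a toy example. The sketch below is not the maxent or forward selection procedure used in the study. It is a one-coefficient logistic regression on hypothetical, perfectly separated data, fitted by plain gradient ascent. The data, the function name `fit_slope`, the step size, and the Cauchy scale of 2.5 are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_slope(x, y, cauchy_scale=None, steps=20000, lr=0.1):
    """Gradient ascent for a logistic model with one slope and no intercept.

    With cauchy_scale=None this maximises the plain log-likelihood (ML).
    Otherwise it maximises log-likelihood plus the log density of a
    zero-centred Cauchy prior on the slope (a MAP estimate).
    """
    b = 0.0
    for _ in range(steps):
        # Gradient of the Bernoulli log-likelihood with respect to b.
        grad = sum((yi - sigmoid(b * xi)) * xi for xi, yi in zip(x, y))
        if cauchy_scale is not None:
            # Gradient of the Cauchy log-prior, -log(1 + (b / s)^2).
            grad -= 2.0 * b / (cauchy_scale ** 2 + b ** 2)
        b += lr * grad
    return b


# Hypothetical data: every presence (1) has x > 0 and every absence (0) has x < 0,
# so x perfectly separates the two classes.
x = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]

# ML estimate keeps growing the longer the optimiser runs.
ml_short = fit_slope(x, y, steps=2000)
ml_long = fit_slope(x, y, steps=20000)

# Cauchy-prior estimate settles at a finite value (scale 2.5 is an assumed choice).
map_short = fit_slope(x, y, cauchy_scale=2.5, steps=2000)
map_long = fit_slope(x, y, cauchy_scale=2.5, steps=20000)

print(f"ML slope:  {ml_short:.3f} after 2000 steps, {ml_long:.3f} after 20000 steps")
print(f"MAP slope: {map_short:.3f} after 2000 steps, {map_long:.3f} after 20000 steps")
```

In this sketch the ML slope never converges, because under separation the likelihood gradient stays positive for every finite slope. The prior term eventually outweighs it, which is why the Cauchy-prior estimate stabilises.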