Abstract

The random forest (RF) model is a powerful machine learning technique that ecologists and fisheries scientists increasingly use for species distribution modeling (SDM), given the various threats to marine habitats and biodiversity. However, the observations available for model training are often constrained by limited surveys and financial resources. Under these circumstances, identifying an appropriate sample size is important for successful predictions. In addition, species with different biological characteristics present different challenges for SDM, and these need to be considered when evaluating model performance. We built and evaluated RF models for 21 marine demersal species using catch data and environmental variables collected during a bottom trawl survey in the coastal waters of Shandong Peninsula, China. The predictive performance of the RF models was evaluated by cross validation for eight sample sizes, with 10–80 sample sites used to train the models. The resulting predictive performance was then examined in relation to a range of biological and behavioral traits. For most species, the predictive performance of the RF model improved substantially when the sample size increased from 10 to 30 sites, but less improvement was evident with larger datasets. An ANOVA identified significant influences of migratory behavior, lifespan, body size, feeding mode and prevalence on model predictability, whereas the effects of trophic level and taxon were not significant, nor were the interactions between sample size and species traits. Abundance distributions were better predicted for benthivores and for species with short migratory distances, short lifespans, and small body sizes. For each species trait, the relative predictive performance of the trait categories was generally consistent across sample sizes and performance metrics. Our study may contribute to an improved understanding of successful SDM and provide guidance for the application of RF models to predict the abundance distributions of fish species.
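The sample-size evaluation described above can be illustrated with a minimal sketch of the resampling design only. Everything specific in it is an assumption rather than the study's method: the eight sizes are taken to be 10, 20, ..., 80, the data are synthetic (100 hypothetical sites with one made-up environmental variable), RMSE is used as the performance metric, and a simple least-squares line stands in for the RF model so that the example runs with the standard library alone. The names `evaluate_sample_sizes` and `fit_line` are hypothetical.

```python
import random
import statistics


def evaluate_sample_sizes(sites, fit, sizes=range(10, 81, 10), repeats=20, seed=0):
    """For each training size, repeatedly draw that many sites for training,
    predict the held-out sites, and return the mean RMSE per size."""
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        errors = []
        for _ in range(repeats):
            shuffled = sites[:]
            rng.shuffle(shuffled)
            train, test = shuffled[:n], shuffled[n:]
            predict = fit(train)
            squared = [(predict(x) - y) ** 2 for x, y in test]
            errors.append(statistics.fmean(squared) ** 0.5)
        results[n] = statistics.fmean(errors)
    return results


def fit_line(train):
    """Stand-in model (NOT a random forest): ordinary least-squares line."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in train) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


# Synthetic survey (assumption): abundance declines with one environmental variable.
data_rng = random.Random(1)
sites = []
for _ in range(100):
    env = data_rng.uniform(5.0, 60.0)
    abundance = 50.0 - 0.5 * env + data_rng.gauss(0.0, 5.0)
    sites.append((env, abundance))

scores = evaluate_sample_sizes(sites, fit_line)
for n, rmse in scores.items():
    print(f"{n} training sites: mean RMSE = {rmse:.2f}")
```

Because `fit` is passed in as a function, an actual RF implementation could be substituted for `fit_line` without changing the resampling loop.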
