Abstract

Spatially continuous predictions of seabed hardness are important baseline environmental information for the sustainable management of Australia’s marine jurisdiction. Seabed hardness is often inferred from multibeam backscatter data with unknown accuracy, and it can also be inferred from underwater video footage, but only at a limited number of locations. In this study, we classified the seabed into four classes based on two new seabed hardness classification schemes (i.e., hard90 and hard70). We developed optimal predictive models of seabed hardness using random forest (RF), based on point data of hardness classes and spatially continuous multibeam data. Five feature selection (FS) methods, namely variable importance (VI), averaged variable importance (AVI), knowledge informed AVI (KIAVI), Boruta and regularized RF (RRF), were compared on predictive accuracy. The effects of highly correlated, important and unimportant predictors on the accuracy of RF predictive models were examined. Finally, spatial predictions generated using the most accurate models were visually examined and analysed. This study confirmed that: 1) hard90 and hard70 are effective seabed hardness classification schemes; 2) seabed hardness of four classes can be predicted with a high degree of accuracy; 3) the typical approach of pre-selecting predictive variables by excluding highly correlated variables needs to be re-examined; 4) the identification of important and unimportant predictors provides useful guidelines for further improving predictive models; 5) FS methods select the most accurate predictive model(s) rather than the most parsimonious ones, and AVI and Boruta are recommended for future studies; and 6) RF is an effective modelling method with high predictive accuracy for multi-level categorical data and can be applied to ‘small p and large n’ problems in environmental sciences. Additionally, automated computational programs for AVI need to be developed to increase its computational efficiency, and caution should be taken when applying filter FS methods to select predictive models.

Highlights

  • Seabed substrate data is important baseline environmental information for supporting the sustainable management of Australia’s marine jurisdiction

  • We reduced the full model by gradually removing the least important variable(s) from the previous model, based on the variable importance measure from random forest (RF), which resulted in 22 models

  • Correct classification rates gradually increased from model 1 and reached a maximum mean (i.e. 87.64%) for model 14, except that the predictive accuracy (PA) of models 3, 5 and 11 slightly decreased after the removal of surface, relief and variance respectively
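
The reduction procedure described in the highlights above can be sketched as a backward-elimination loop. This is a minimal sketch under stated assumptions, not the authors' code: `importance_fn` and `accuracy_fn` are hypothetical callables standing in for a fitted RF's variable importance and a cross-validated correct classification rate, the predictor names and toy scores are invented for illustration, and the loop drops exactly one predictor per step, whereas the study may remove several at once.

```python
from typing import Callable, Dict, List, Sequence, Tuple

History = List[Tuple[List[str], float]]


def backward_elimination(
    predictors: Sequence[str],
    importance_fn: Callable[[List[str]], Dict[str, float]],
    accuracy_fn: Callable[[List[str]], float],
) -> History:
    """Record the accuracy of each model while dropping the least
    important predictor one step at a time."""
    current = list(predictors)
    history: History = []
    while current:
        # Score the model built on the current predictor set.
        history.append((list(current), accuracy_fn(current)))
        if len(current) == 1:
            break
        # Importance is re-evaluated on the current (reduced) model.
        importance = importance_fn(current)
        least = min(current, key=lambda name: importance[name])
        current.remove(least)
    return history


def most_accurate(history: History) -> Tuple[List[str], float]:
    """Return the predictor set with the highest recorded accuracy."""
    return max(history, key=lambda item: item[1])


# Toy stand-ins (assumptions): fixed importances and a made-up accuracy
# in which informative predictors help and noise predictors hurt.
TOY_IMPORTANCE = {"bathy": 0.5, "backscatter": 0.4, "noise_1": 0.05, "noise_2": 0.01}
INFORMATIVE = {"bathy", "backscatter"}


def toy_importance(names: List[str]) -> Dict[str, float]:
    return {name: TOY_IMPORTANCE[name] for name in names}


def toy_accuracy(names: List[str]) -> float:
    used = set(names)
    return 70.0 + 8.0 * len(used & INFORMATIVE) - 2.0 * len(used - INFORMATIVE)


history = backward_elimination(list(TOY_IMPORTANCE), toy_importance, toy_accuracy)
best_predictors, best_accuracy = most_accurate(history)
print(best_predictors, best_accuracy)
```

Picking the best model by accuracy rather than by size mirrors the abstract's point that FS methods select the most accurate model(s) instead of the most parsimonious ones. In practice the two callables would wrap a real RF fit and a validation routine.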

Introduction

Seabed substrate data is important baseline environmental information for supporting the sustainable management of Australia’s marine jurisdiction. Seabed substrate is an important factor controlling the spatial distribution of benthic marine communities, as it influences the colonisation and formation of ecological communities and the abundance of benthic organisms [1,2,3,4,5,6]. A spatially continuous measurement of seabed hardness would therefore be a significant aid both in predicting the spatial distribution of benthic marine communities and in marine ecosystem management. It can also support the sustainable exploitation of marine resources and infrastructure planning (e.g. selection of pipeline routes).
