Submerged aquatic vegetation, meaning benthic macroalgae and plants that grow obligately underwater, is a critical component of marine ecosystems and is frequently found to provide preferential recruitment habitats. Mapping and monitoring aquatic vegetation through remote sensing and machine learning is becoming an important part of managing coastal environments at scale. Accurate mapping and monitoring require robust sampling and occurrence data to assess predictive error and quantify the extent of submerged vegetation. The design of the ground-truthing survey (preferential, random, grid-based or spatially balanced) could significantly influence predictive model outcomes and the overall accuracy of mapping and monitoring. Here, we compare maps of aquatic vegetation extent ground-truthed using two sampling designs, preferential and spatially balanced, across four coastal sites along the midwest of Australia. We validate the maps using spatial cross-validation and demonstrate that spatially balanced ground truthing significantly outperforms preferential sampling in both modelled extent and map accuracy. On average, preferential designs overestimated vegetation extent by 25 percent relative to balanced designs and achieved a kappa statistic, F1 score and area under the curve (AUC) of 0.48, 0.615 and 0.517, respectively, whereas balanced designs achieved a kappa statistic, F1 score and AUC of 0.84, 0.85 and 0.83, respectively. We strongly recommend that sampling designs for remote sensing-derived habitat models be spatially balanced where habitat extent is proposed as a metric for monitoring.
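For readers unfamiliar with the accuracy metrics reported above, the following is a minimal sketch of how a kappa statistic and an F1 score can be computed from a binary presence/absence confusion matrix. It is not the authors' validation code: the function name `kappa_and_f1` and the example counts are hypothetical, chosen only to illustrate the standard formulas.

```python
def kappa_and_f1(tp, fp, fn, tn):
    """Return (Cohen's kappa, F1) for a binary presence/absence confusion matrix.

    tp: vegetation predicted and observed
    fp: vegetation predicted but not observed
    fn: vegetation observed but not predicted
    tn: vegetation neither predicted nor observed
    """
    n = tp + fp + fn + tn
    # Proportion of validation points where map and ground truth agree
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    # Kappa rescales observed agreement relative to chance agreement
    # (undefined if expected == 1, i.e. every point falls in a single class)
    kappa = (observed - expected) / (1 - expected)
    # F1 is the harmonic mean of precision and recall for the presence class
    f1 = 2 * tp / (2 * tp + fp + fn)
    return kappa, f1


# Hypothetical counts for illustration only, not data from the study
kappa, f1 = kappa_and_f1(tp=40, fp=10, fn=10, tn=40)
print(f"kappa={kappa:.2f}, F1={f1:.2f}")
```

Because kappa discounts agreement expected by chance, it is typically lower than raw agreement when the two classes are similarly common, which is one reason it is often reported alongside F1 and AUC.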