Aboveground biomass (AGB) estimates derived from Landsat’s spectral bands are limited by spectral saturation when AGB densities exceed 150–300 Mg ha−1. Statistical features that characterize image texture have been proposed as a means to alleviate spectral saturation. However, apart from Gray Level Co-occurrence Matrix (GLCM) statistics, many spatial feature engineering techniques (e.g., morphological operations or edge detectors) have not been evaluated in the context of forest AGB estimation. Moreover, many prior investigations have been constrained by limited geographic domains and sample sizes. We use 176 lidar-derived AGB maps covering ∼9.3 million ha of forests in the Pacific Northwest of the United States to construct an expansive AGB modeling dataset that spans numerous biophysical gradients and contains AGB densities exceeding 1000 Mg ha−1. We conduct a large-scale inter-comparison of multiple spatial feature engineering techniques, including GLCMs, edge detectors, morphological operations, spatial buffers, neighborhood vectorization, and neighborhood similarity features. Our numerical experiments indicate that, of the spatial feature engineering strategies considered, statistical features derived from GLCMs and spatial buffers yield the greatest improvement in AGB model performance. Including spatial features in Random Forest AGB models reduces the root mean squared error (RMSE) by 9.97 Mg ha−1. We contextualize this improvement in model performance by comparing it with AGB models developed with multi-temporal features derived from the LandTrendr and Continuous Change Detection and Classification algorithms. The inclusion of temporal features reduces the model RMSE by 18.41 Mg ha−1. When spatial and temporal features are both included in the model’s feature set, the RMSE decreases by 21.71 Mg ha−1. We conclude that spatial feature engineering strategies can yield nominal gains in model performance. However, this improvement comes at the cost of increased model prediction bias.