National soil organic carbon (SOC) maps are essential for improving greenhouse gas accounting and supporting climate-smart agriculture. Large-scale SOC models based on wall-to-wall soil information from remote sensing remain a challenge because natural soil conditions are highly diverse and the spatial location of the soil samples is difficult to account for. In this study, we tested whether local ensemble models (LEM) can improve SOC predictions from Landsat-based soil reflectance composites (SRC) for Germany. To this end, we divided the research area into 30 × 30 km tiles and fitted local generalized linear models (GLM) to random, nearby observations. From these GLMs, local SOC maps were predicted and aggregated using a moving window approach. The local variable importance was analyzed to identify spatial dependencies in the correlation between the SRC and SOC. For the final SOC map, a Random Forest (RF) model was trained using the aggregated local SOC predictions, the SRC, and the full set of training samples from the agricultural soil inventory. The results show that the LEM improved the accuracy (R² = 0.68; RMSE = 5.6 g kg−1) compared to maps based on a single, global model (R² = 0.52; RMSE = 6.8 g kg−1). The local variable importance of the spectral bands showed clear spatial patterns throughout the research area. These differences can be explained by local soil conditions, which influence the correlation between SOC and the spectral properties. Compared to the widely adopted integration of distance covariates such as geographical coordinates, the LEM reduced the spatial autocorrelation to a greater extent and improved the prediction accuracy, especially for underrepresented SOC values. The LEM presents a new method to integrate spatial information and increase the interpretability of digital soil mapping (DSM) models.
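The tiling, local fitting, and moving window aggregation described above can be illustrated with a minimal sketch. It is not the authors' implementation: ordinary least squares stands in for the local GLM (a Gaussian GLM with identity link), and the function names and the parameters `radius_km`, `n_draw`, and `n_models` are hypothetical choices made for illustration. The final RF step, which would take the aggregated local predictions as an additional covariate, is omitted.

```python
import numpy as np


def fit_ols(X, y):
    """Fit a Gaussian GLM with identity link, i.e. ordinary least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the band weights


def predict_ols(coef, X):
    """Predict with coefficients returned by fit_ols."""
    return coef[0] + X @ coef[1:]


def local_ensemble_predict(xy_train, X_train, y_train, xy_new, X_new,
                           tile_km=30.0, radius_km=30.0,
                           n_draw=50, n_models=5, seed=0):
    """Average the predictions of local models fitted around tile centres.

    xy_* are (n, 2) coordinates in km, X_* are (n, p) spectral covariates.
    radius_km, n_draw and n_models are assumed values, not from the study.
    """
    rng = np.random.default_rng(seed)
    lo, hi = xy_train.min(axis=0), xy_train.max(axis=0)
    cx = np.arange(lo[0] + tile_km / 2, hi[0] + tile_km / 2, tile_km)
    cy = np.arange(lo[1] + tile_km / 2, hi[1] + tile_km / 2, tile_km)
    centres = np.array([(x, y) for x in cx for y in cy])

    pred_sum = np.zeros(len(xy_new))
    n_pred = np.zeros(len(xy_new))
    min_obs = X_train.shape[1] + 2  # need more samples than coefficients

    for centre in centres:
        # Training samples near this tile centre.
        nearby = np.flatnonzero(
            np.linalg.norm(xy_train - centre, axis=1) <= radius_km)
        # Moving window: this tile contributes to all targets within the radius.
        target = np.linalg.norm(xy_new - centre, axis=1) <= radius_km
        if len(nearby) < min_obs or not target.any():
            continue
        for _ in range(n_models):
            # Each ensemble member sees a random subset of nearby samples.
            idx = rng.choice(nearby, size=min(n_draw, len(nearby)),
                             replace=False)
            coef = fit_ols(X_train[idx], y_train[idx])
            pred_sum[target] += predict_ols(coef, X_new[target])
            n_pred[target] += 1

    out = np.full(len(xy_new), np.nan)  # NaN where no local model applies
    covered = n_pred > 0
    out[covered] = pred_sum[covered] / n_pred[covered]
    return out
```

Because overlapping windows let several tiles contribute to each location, the averaged prediction changes smoothly across tile borders. Locations that no local model covers are returned as NaN rather than filled with a guess.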