The main goal of this study was to compare the effects of covariates from different sources and at different spatial resolutions on the prediction of soil organic carbon (SOC) in a semi-arid region in western Iran. For this purpose, the measured SOC contents of 67 topsoil samples (0–30 cm) were used as the dependent variable. The covariates controlling SOC content were prepared under two scenarios. In the first scenario (scenario I), six covariate sets with spatial resolutions ranging from 2 to 30 m, at both original and aggregated pixel sizes, were derived from digital elevation models (DEMs) and remote sensing data. In the second scenario (scenario II), the available legacy data, including geology, land use and soil texture maps, were prepared at a compatible spatial resolution and added to each covariate set from scenario I. After feature selection, modelling was performed using two machine learning models, namely Random Forest (RF) and Support Vector Machine (SVM). Performance analysis by leave-one-out cross-validation (LOOCV) showed that RF with covariate set B (10 m spatial resolution) performed best in predicting SOC, both in scenario I (R² = 0.21, CCC = 0.41, MAE = 0.26 and RMSE = 0.34%) and in scenario II (R² = 0.32, CCC = 0.51, MAE = 0.24 and RMSE = 0.32%). In addition, the remote sensing data were identified as the most important variables controlling the spatial distribution of SOC. Finally, with RF as the superior model, the SOC map produced from covariate set B in scenario II, which combined the three types of covariates (DEM, remote sensing data and legacy data), showed lower uncertainty than the SOC map produced from covariate set B in scenario I.
In general, our results showed that the model type and the source, spatial resolution and combination of the covariates could greatly influence the prediction outputs. An SOC map produced from a parsimonious combination of variables at the optimal pixel size could therefore support decision-making in environmental resources management.
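The evaluation procedure named above (LOOCV scored with R², CCC, MAE and RMSE) can be illustrated with a minimal sketch. This is not the authors' code: the function names `evaluation_metrics` and `leave_one_out_predictions` are hypothetical, R² is assumed here to be the coefficient of determination (the abstract does not say which definition was used), CCC is taken as Lin's concordance correlation coefficient, and a simple mean predictor stands in for RF or SVM so that the example needs only the standard library.

```python
from math import sqrt
from statistics import fmean


def evaluation_metrics(observed, predicted):
    """Return R2, CCC, MAE and RMSE for paired observed/predicted values."""
    n = len(observed)
    mean_o, mean_p = fmean(observed), fmean(predicted)
    # Population (1/n) variances and covariance, as in Lin's CCC.
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return {
        # Assumption: R2 as coefficient of determination, 1 - SS_res / SS_tot.
        "R2": 1 - mse / var_o,
        "CCC": 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2),
        "MAE": fmean(abs(o - p) for o, p in zip(observed, predicted)),
        "RMSE": sqrt(mse),
    }


def leave_one_out_predictions(X, y, fit, predict):
    """Hold out each sample once, train on the rest, predict the held-out one."""
    predictions = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        predictions.append(predict(model, X[i]))
    return predictions


# Stand-in model (assumption): always predicts the mean of the training targets.
def fit_mean(X_train, y_train):
    return fmean(y_train)


def predict_mean(model, x):
    return model


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    covariates = [[0.1], [0.4], [0.5], [0.9]]
    soc = [0.8, 1.1, 1.3, 1.9]
    preds = leave_one_out_predictions(covariates, soc, fit_mean, predict_mean)
    print(evaluation_metrics(soc, preds))
```

In practice, `fit` and `predict` would wrap a real RF or SVM regressor, and the same loop would be repeated for each covariate set and scenario so that the resulting metrics can be compared.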