A trustworthiness indicator to select sample points for the individual predictive soil mapping method (iPSM)

Jing Liu, A-Xing Zhu, David Rossiter, Fei Du, James Burt

doi:10.1016/j.geoderma.2020.114440

Abstract

Observations of soil at georeferenced sample points are an indispensable input to digital soil mapping (DSM) models, which relate soil properties and types to the soil-forming environment. Many existing DSM methods require soil samples to be collected following a pre-defined sampling pattern, a requirement that is often not met for operational reasons. The individual sample-based predictive soil mapping (iPSM) method was developed to avoid this requirement. However, the prediction accuracy of iPSM depends heavily on the reliability of the known sample points used in prediction. Unreliable sample points are those with an unreliable soil-environment relationship: the target soil property value and the environmental covariate data are not correctly paired at the sample point location. Such points lead to poorly constructed models and low prediction accuracy. This paper presents a new method to estimate the reliability of the soil-environment relationship at each sample location and to evaluate the trustworthiness of the prediction at each unvisited location. Under the assumption that sample points with similar environmental conditions have similar soil property values, the method first identifies the sample points that are environmentally similar to the sample point being evaluated, then uses their agreement on the target soil property to evaluate the reliability of the soil-environment relationship at that point. The effectiveness of the method was assessed in a case study in Anhui Province, China, mapping soil organic matter content (SOM, %) in the topsoil. As the reliability threshold increased from 0.5 to 0.8, more sample points were excluded from prediction as unreliable: 41% of all sample points were excluded at a threshold of 0.5, and 88% at a threshold of 0.8. As a result, less of the unknown area could be predicted; only nine validation points were predictable at the 0.8 threshold. However, prediction accuracy improved: the root mean squared error (RMSE) decreased from 1.37% to 0.63%, and R² increased from 0.21 to 0.98. A prediction trustworthiness value was also produced at each unvisited location, and it was negatively related to the absolute prediction residual. This study shows that the reliability of individual sample points is an important determinant of the prediction accuracy of the iPSM method. When applying the iPSM method, an optimal trade-off between prediction accuracy and completeness of the predicted map needs to be found by adjusting the reliability threshold on the sample points used in prediction.
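
The reliability idea summarized above can be illustrated with a minimal sketch. The abstract does not give the similarity measure or the agreement formula, so everything specific here is an assumption made for illustration and not the authors' implementation: the Gaussian kernel on standardized covariates, the `sim_threshold` cutoff for "environmentally similar", the range-normalized agreement score, and the function names `environmental_similarity` and `sample_reliability`. The toy data are invented.

```python
import numpy as np

def environmental_similarity(covariates, bandwidth=1.0):
    """Pairwise similarity in (0, 1] between sample points.

    Assumption: Gaussian kernel on standardized covariates; every
    covariate column must have non-zero variance.
    """
    z = (covariates - covariates.mean(axis=0)) / covariates.std(axis=0)
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def sample_reliability(covariates, values, sim_threshold=0.5, bandwidth=1.0):
    """Reliability of each sample as the similarity-weighted agreement
    of its environmentally similar peers on the target soil property."""
    sim = environmental_similarity(covariates, bandwidth)
    np.fill_diagonal(sim, 0.0)  # a point does not evaluate itself
    value_range = values.max() - values.min()
    reliability = np.full(len(values), np.nan)
    for i in range(len(values)):
        peers = sim[i] >= sim_threshold
        if not peers.any():
            continue  # no similar peers: reliability stays undefined (NaN)
        if value_range == 0:
            reliability[i] = 1.0  # all samples agree exactly
            continue
        # Agreement is 1 for identical values, 0 for values a full range apart.
        agreement = 1.0 - np.abs(values[peers] - values[i]) / value_range
        reliability[i] = np.average(agreement, weights=sim[i, peers])
    return reliability

# Invented toy data: one covariate, two environmental groups.
# Sample 3 sits in the first group but carries a value typical of the second.
covs = np.array([[0.0], [0.1], [0.2], [0.1], [5.0], [5.1], [5.2]])
som = np.array([2.0, 2.1, 1.9, 5.0, 5.0, 5.1, 4.9])

rel = sample_reliability(covs, som)
keep = rel >= 0.5  # reliability threshold, as in the 0.5 to 0.8 range above
```

In this toy case the mispaired sample receives the lowest reliability and is dropped at a threshold of 0.5. It also lowers the scores of its neighbours, which hints at why raising the threshold removes an increasing share of the sample set.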

Full Text