Abstract

Environmental covariates are fundamental inputs to digital soil mapping (DSM) based on the soil–environment relationship. Individual environmental covariates commonly contain invalid values (recorded as NoData) in some regions of a study area, especially a large one. Of the two main existing ways to handle locations with invalid covariate data in DSM, the location-skipping scheme does not predict at these locations and thus completely ignores the potentially useful information provided by the valid covariate values there. The void-filling scheme may introduce errors when an interpolation algorithm is applied to remove NoData covariate values. In this study, we propose a new scheme called FilterNA, which conducts DSM at each location that has a NoData value for a covariate by using the valid values of the other covariates at that location. We design a new method (SoLIM-FilterNA) that combines the FilterNA scheme with a DSM method, the Soil Land Inference Model (SoLIM). Experiments predicting soil organic matter content in the topsoil layer in Anhui Province, China, under different test scenarios of NoData in environmental covariates were conducted to compare SoLIM-FilterNA with SoLIM combined with the void-filling scheme, the original SoLIM with the location-skipping scheme, and random forest. Results based on independent evaluation samples show that, in general, SoLIM-FilterNA produces the lowest errors with a more complete spatial coverage of the DSM result. SoLIM-FilterNA can also reasonably predict uncertainty by accounting for the uncertainty introduced by applying the FilterNA scheme.
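The FilterNA idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian-shaped similarity function, the minimum operator used to integrate per-covariate similarities, the similarity-weighted average, and all names (`NODATA`, `covariate_similarity`, `predict_filter_na`, `bandwidths`) are assumptions made for illustration. The only part taken from the text is the filtering step, in which covariates that are NoData at a cell are dropped and the prediction uses the remaining valid covariates.

```python
import math

NODATA = None  # hypothetical NoData marker used only in this sketch


def covariate_similarity(a, b, bandwidth):
    """Similarity in (0, 1] between two covariate values (assumed Gaussian shape)."""
    return math.exp(-0.5 * ((a - b) / bandwidth) ** 2)


def predict_filter_na(cell, samples, bandwidths):
    """Predict a soil property at one cell, ignoring covariates that are NoData there.

    cell:       list of covariate values, NODATA where the value is invalid
    samples:    list of (covariate_values, soil_value) pairs with valid covariates
    bandwidths: one similarity bandwidth per covariate (assumed parameter)
    Returns (prediction, number_of_covariates_used); prediction is None when
    the cell has no valid covariate at all.
    """
    # FilterNA step: keep only the covariates that are valid at this cell.
    valid = [i for i, value in enumerate(cell) if value is not NODATA]
    if not valid:
        return None, 0

    weighted_sum = 0.0
    weight_total = 0.0
    for covariates, soil_value in samples:
        # Assumed integration rule: the least similar covariate is limiting.
        similarity = min(
            covariate_similarity(cell[i], covariates[i], bandwidths[i])
            for i in valid
        )
        weighted_sum += similarity * soil_value
        weight_total += similarity

    if weight_total == 0.0:
        return None, len(valid)
    # Similarity-weighted average of the sample soil values.
    return weighted_sum / weight_total, len(valid)


# Two samples described by two covariates; the cell lacks the second covariate.
samples = [([10.0, 1.0], 20.0), ([30.0, 5.0], 40.0)]
prediction, used = predict_filter_na([10.0, NODATA], samples, [5.0, 1.0])
print(prediction, used)
```

Returning the number of covariates actually used is a design choice of this sketch: a prediction based on fewer covariates rests on less information, which is one simple way to carry the extra uncertainty of the filtering step alongside the predicted value.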

Highlights

  • Soil information at high resolution, accuracy, and spatial coverage completeness over a large area is increasingly essential for geoscientific modeling applications, such as ecological modeling, hydrological modeling, agricultural management, and land use management [1,2,3,4]

  • We propose a new method with a new scheme to overcome the above-mentioned limitations in the existing schemes for dealing with the NoData values of environmental covariates for digital soil mapping (DSM)

  • According to root mean square error (RMSE) and mean absolute error (MAE) values from each method under the cell-level test scenarios based on independent evaluation samples (Table 3), the Soil Land Inference Model (SoLIM)-FilterNA method obtained the lowest errors in terms of RMSE among those under test, as well as the lowest MAE, except for the MAE of random forest (RF) under the T(V5) scenario
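For reference, the two error measures named in the last highlight can be computed as in the following minimal sketch. The function names and the sample values are illustrative assumptions, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between paired observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


# Made-up values at three evaluation samples, for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(rmse(observed, predicted), mae(observed, predicted))
```

Because RMSE squares the residuals before averaging, it penalizes a few large errors more heavily than MAE does, so the two measures can rank methods differently.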

Summary

Introduction

Soil information at high resolution, accuracy, and spatial coverage completeness over a large area is increasingly essential for geoscientific modeling applications, such as ecological modeling, hydrological modeling, agricultural management, and land use management [1,2,3,4]. Many often-used DSM methods, such as SoLIM [8] and the random forest algorithm [9,10], use the location-skipping scheme. For cells with NoData values for a few covariates (e.g., only one covariate) and valid values for the other covariates, the location-skipping scheme completely ignores the potentially useful information provided by the valid covariate values. Note that each environmental covariate may have NoData values in different regions or locations. This scheme may therefore worsen the completeness of the data layers, i.e., result in a larger NoData area in the predicted soil map than in any single environmental covariate layer. The limitation of the void-filling scheme is that the accuracy of the DSM result is affected by the errors introduced by the average-value estimation or interpolation algorithm used, including the errors propagated and accumulated during iterative interpolation over a continuous area of NoData [13]. With the proposed scheme, spatial coverage of the DSM result can be made as complete as possible, while no error is introduced by interpolating NoData values.
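The completeness problem of the location-skipping scheme can be made concrete with a small sketch. The three layer names and their values are made up for illustration, and `None` stands in for NoData. Each layer is missing a different cell, so a scheme that skips every cell with at least one NoData covariate loses the union of those cells.

```python
def nodata_fraction(layer, nodata=None):
    """Fraction of cells in one covariate layer that are NoData."""
    return sum(1 for value in layer if value is nodata) / len(layer)


def skipped_fraction(layers, nodata=None):
    """Fraction of cells skipped when any covariate being NoData skips the cell."""
    n_cells = len(layers[0])
    skipped = sum(
        1 for i in range(n_cells) if any(layer[i] is nodata for layer in layers)
    )
    return skipped / n_cells


# Hypothetical flattened covariate layers over the same six cells.
slope = [1.0, None, 3.0, 4.0, 5.0, 6.0]
ndvi = [1.0, 2.0, 3.0, None, 5.0, 6.0]
precipitation = [1.0, 2.0, 3.0, 4.0, 5.0, None]
layers = [slope, ndvi, precipitation]

print([nodata_fraction(layer) for layer in layers])  # one missing cell per layer
print(skipped_fraction(layers))  # union of the missing cells across layers
```

In this toy case each layer lacks one cell in six, yet half of the cells are skipped, even though every skipped cell still has two valid covariates.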

Basic Idea
Detailed Design of the Proposed Method
Study Area and Data
Experimental Design
Evaluation Method
Under the Cell-Level Test Scenarios
Methods
Under the Block-Level Test Scenarios
Prediction Uncertainty