Abstract

This study investigates sampling design for mapping soil classes based on multiple environmental features associated with the soil classes. Two types of sampling design for calibrating the prediction models are compared: conditioned Latin hypercube sampling (CLHS) and feature space coverage sampling (FSCS). Simple random sampling (SRS), which does not utilize the environmental features, is added as a reference design. The sample sizes used are 20, 30, 40, 50, 75, and 100 points, and at each sample size 100 sample sets were drawn using each of the three types of design. Each of these sample sets was then used to calibrate three prediction models: random forest (RF), individual predictive soil mapping (iPSM), and multinomial logistic regression (MLR). The sampling designs were compared based on the overall accuracy of the predicted soil class maps obtained with these three prediction methods. The comparison was conducted in two study areas: Ammertal (Germany) and Raffelson (USA). For each of these two areas a detailed legacy soil class map is available. These soil class maps were used as references in a simulation study for the comparison. Results for both study areas show that on average FSCS outperforms CLHS and SRS for all three prediction methods. The difference in estimated medians of overall accuracy between CLHS and SRS was marginal. Moreover, the variation in overall accuracy among sample sets of the same size was considerably smaller for FSCS than for CLHS. These results in the two study areas suggest that FSCS is a more effective sampling design.

Highlights

  • Information on the spatial distribution of soil classes is of great importance for, amongst others, agricultural management and watershed process simulation (Cook et al., 2008; Lagacherie, 2008; McBratney et al., 2003; Sanchez et al., 2009)

  • The differences between the estimated medians and estimated means were very small, so when we refer to the estimated median hereafter, the statement also holds for the estimated mean

  • For Ammertal, the difference in estimated medians with simple random sampling (SRS) and conditioned Latin hypercube sampling (CLHS) was significant at the 0.05 level for all sample sizes when using random forest (RF) and individual predictive soil mapping (iPSM) as the prediction method, but with multinomial logistic regression (MLR) these differences were not significant (Table 4)


Summary

Introduction

Information on the spatial distribution of soil classes is of great importance for, amongst others, agricultural management and watershed process simulation (Cook et al., 2008; Lagacherie, 2008; McBratney et al., 2003; Sanchez et al., 2009). As the soil cannot be observed everywhere, we need to predict the soil classes at unvisited locations from a finite set of observations at other locations. These observations can be used to calibrate a model relating the soil classes to environmental features whose spatial variation is readily available. The calibrated model can then be used to predict the soil class at any unvisited location in the area. The selection of observation locations (sample locations) for calibration is an important step (Brus, 2019) because these locations determine the data on which the prediction model is calibrated. The sampling design, which determines the calibration sample locations, therefore directly affects the accuracy of the predicted spatial distribution of soil classes. How to improve sampling efficiency, that is, how to achieve a highly accurate and detailed soil class map with a limited number of samples, is an important issue in soil sampling.
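
To make the contrast between the designs concrete, the following is a minimal sketch, not the authors' code. It assumes the common formulation of FSCS as k-means clustering of the standardized environmental features, with the unit nearest to each cluster center selected as a sample location, and it takes overall accuracy to be the proportion of units whose predicted class matches the reference map. The function names (`standardize`, `srs`, `fscs`, `overall_accuracy`) are hypothetical, and CLHS is left out because it needs a separate optimization routine.

```python
import random


def standardize(features):
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*features))
    means = [sum(c) / len(c) for c in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [
        (sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
        for c, m in zip(columns, means)
    ]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, sds))
        for row in features
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def srs(n_units, n, rng):
    """Simple random sampling without replacement; ignores the features."""
    return rng.sample(range(n_units), n)


def fscs(features, n, rng, n_iter=50):
    """Assumed FSCS: k-means in feature space, then the unit nearest each center."""
    z = standardize(features)
    centers = [z[i] for i in rng.sample(range(len(z)), n)]
    for _ in range(n_iter):
        # Assign every unit to its nearest center.
        clusters = [[] for _ in range(n)]
        for point in z:
            nearest = min(range(n), key=lambda k: sq_dist(point, centers[k]))
            clusters[nearest].append(point)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centers[k]
            for k, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # Select, for each center, the closest unit that has not been selected yet.
    selected = []
    for c in centers:
        candidates = [i for i in range(len(z)) if i not in selected]
        selected.append(min(candidates, key=lambda i: sq_dist(z[i], c)))
    return selected


def overall_accuracy(predicted, reference):
    """Proportion of units whose predicted class equals the reference class."""
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)
```

In a simulation like the one described in the abstract, each design would be called repeatedly (for example 100 times per sample size), a prediction model calibrated on every selected sample, and `overall_accuracy` computed against the legacy soil class map that serves as the reference.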
