Abstract

Predicting taxonomic classes can be challenging when a dataset is subject to substantial irregularities due to the involvement of many surveyors. In the present study, a data pruning approach was used to reduce such source errors. The study explored whether different data pruning methods, which produce different subsets of a major reference soil group (RSG), the Plinthosols, would increase the prediction accuracy of the minor soil groups when using Random Forest (RF). This approach was compared with random oversampling. Four datasets were used: the entire dataset and three pruned datasets comprising the 80%, 90%, and standard deviation core ranges of the Plinthosols data, with all data points in the outer range cut off. The best prediction was achieved when RF was used with recursive feature elimination on the non-oversampled 90% core range dataset. This model showed substantial agreement with observations, with a kappa value of 0.57 and a 7% to 35% increase in prediction accuracy for the smaller RSGs. The reference soil groups in the Dano catchment appeared to be mainly influenced by the wetness index, a proxy for soil moisture distribution.
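
The core-range pruning described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the core range is computed or on which variable, so everything here is an assumption made for illustration: the 80% and 90% core ranges are taken as the central share of a single numeric attribute, the standard deviation range as the mean plus or minus one standard deviation, and the names `core_range_bounds` and `prune_major_class` and the toy values are hypothetical.

```python
import statistics


def core_range_bounds(values, method):
    """Return (low, high) bounds of the core range of a list of numbers.

    Assumed definitions (not taken from the paper):
    "p80" / "p90" keep the central 80% / 90% of the sorted values;
    "sd" keeps values within one standard deviation of the mean.
    """
    if method == "sd":
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        return mean - sd, mean + sd
    share = {"p80": 0.80, "p90": 0.90}[method]
    tail = (1.0 - share) / 2.0
    ordered = sorted(values)
    last = len(ordered) - 1
    # Nearest-rank positions of the lower and upper cut-offs.
    low = ordered[round(tail * last)]
    high = ordered[round((1.0 - tail) * last)]
    return low, high


def prune_major_class(samples, major_class, method):
    """Drop major-class samples outside the core range; keep all others."""
    major_values = [value for label, value in samples if label == major_class]
    low, high = core_range_bounds(major_values, method)
    return [
        (label, value)
        for label, value in samples
        if label != major_class or low <= value <= high
    ]


# Toy data: 21 samples of the major group plus two minor-group samples.
samples = [("Plinthosols", float(v)) for v in range(1, 22)]
samples += [("Gleysols", 50.0), ("Cambisols", -3.0)]

pruned_90 = prune_major_class(samples, "Plinthosols", "p90")
pruned_sd = prune_major_class(samples, "Plinthosols", "sd")
```

Only the major group is pruned in this sketch; samples of the minor groups are always kept, which matches the stated aim of improving their prediction accuracy. A classifier such as RF would then be trained on each pruned dataset and compared against the model trained on the entire dataset.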

Highlights

  • Soils play a vital role for various ecosystem services, which makes them a key asset for sustainable living conditions on earth

  • The “scorpan” function introduces a conceptual framework for quantitative pedology and is defined as follows: Sc = f (s, c, o, r, p, a, n) where Sc is soil class, s is for soils and other soil attributes, c is climate, o is organism, r is relief, p is parent material, a is age, n is spatial location, and f is function or soil spatial prediction function (SSPF) model

  • The present study focused on covariates derived from terrain attributes from the Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) with a 90 m resolution[44], multi-temporal RapidEye and Landsat imagery bands, and indices calculated from them, along with maps of parent material, climate and land cover data, to unravel the complex soil environmental relations regarding spatial soil class distribution


Summary

Introduction

Soils play a vital role for various ecosystem services, which makes them a key asset for sustainable living conditions on earth. As a time- and cost-effective alternative to classical soil surveys, digital soil mapping (DSM)[13] is a subset of pedometrical research that uses geostatistics and data mining methods to spatially predict soil classes or soil properties based on existing soil and environmental covariate data. These relations are described by the “scorpan” function developed by McBratney et al.[13], which is based on the soil state-factor equation introduced by Jenny[14]. Data mining methods have become more popular in DSM, and among the most commonly used decision tree (DT) algorithms are C4.5/See5[22], CART (Classification and Regression Trees)[23], and Random Forest (RF)[24,25,26].
