Abstract

Soil maps provide a method for graphically communicating what is known about the spatial distribution of soil properties in nature. We proposed an optimized pipeline, named dino-soil toolbox, programmed in R for mapping quantitative and categorical properties from legacy soil data. The pipeline, composed of four main modules (data preprocessing, covariate selection, exploratory data analysis, and modeling), was tested across a study area of 14,537 km² located between the departments of Cesar and Magdalena, Colombia. We [...]

Highlights

  • Soil classification is a method for organizing and communicating knowledge and perceptions about soil properties

  • We propose to derive estimates of uncertainty for quantitative variables by fitting a quantile regression forest model (QRF)

  • By providing multiple outputs such as tables, charts, maps, and geospatial data in four main steps, the pipeline offers considerable robustness to support the outcomes and analysis of a digital soil mapping (DSM) project. These components are aligned with the recommendations of Wadoux et al. (2020a) on plausibility, interpretability, and explainability in machine learning (ML)-DSM developments, which enable soil scientists to couple model prediction with pedological explanation and understanding of the underlying soil processes
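
To make the quantile regression forest idea in the highlights concrete, the following is a minimal sketch in plain Python, not the dino-soil toolbox code (which is written in R). It assumes a single synthetic covariate, one-split trees, and hypothetical function names (`fit_stump`, `fit_forest`, `predict_quantiles`). Each tree spreads a weight evenly over the training responses that share a leaf with the query point, and the prediction quantiles are read from the resulting weighted empirical distribution.

```python
import random


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(xs, ys):
    """Return the split threshold of a one-split regression tree.

    Returns infinity (a single leaf) when no split reduces the error.
    """
    cuts = sorted(set(xs))
    best_t, best_err = float("inf"), sse(ys)
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        err = sse(left) + sse(right)
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def fit_forest(xs, ys, n_trees, rng):
    """Fit stumps on bootstrap samples, keeping each sample for leaf lookup."""
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        forest.append((fit_stump(bx, by), bx, by))
    return forest


def predict_quantiles(forest, x0, probs):
    """Weighted empirical quantiles of the responses sharing a leaf with x0."""
    weighted = []
    for t, bx, by in forest:
        leaf = [y for x, y in zip(bx, by) if (x <= t) == (x0 <= t)]
        w = 1.0 / (len(forest) * len(leaf))  # each tree carries equal total weight
        weighted.extend((y, w) for y in leaf)
    weighted.sort()
    quantiles = []
    for p in probs:
        cum = 0.0
        for y, w in weighted:
            cum += w
            if cum >= p:
                quantiles.append(y)
                break
        else:
            quantiles.append(weighted[-1][0])  # guard against rounding error
    return quantiles


# Synthetic data (an assumption for illustration, not the study's data).
rng = random.Random(42)
covariate = [rng.uniform(0, 10) for _ in range(60)]
soil_property = [(2.0 if x < 5 else 6.0) + rng.gauss(0, 0.5) for x in covariate]

forest = fit_forest(covariate, soil_property, n_trees=50, rng=rng)
lower, median, upper = predict_quantiles(forest, 2.0, [0.05, 0.5, 0.95])
print(f"90% prediction interval at x=2.0: [{lower:.2f}, {upper:.2f}], median {median:.2f}")
```

The width of the interval between the lower and upper quantiles serves as the uncertainty estimate at each prediction location. A production workflow would use full-depth trees and many covariates through an established implementation rather than this sketch.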


Summary

Introduction

Soil classification is a method for organizing and communicating knowledge and perceptions about soil properties. The existing methods in DSM can be grouped into two main modeling types: conventional (statistical and geostatistical) and machine learning (ML). In the former type, a soil property is modeled as a linear relationship between the property and state factors, which accounts for the deterministic portion of the total variation, plus a spatially dependent stochastic portion that is modeled with kriging methods (Keskin and Grunwald, 2018). Unlike geostatistical methods, in which the original observations often must be transformed to satisfy model assumptions, ML algorithms make no assumptions about the distribution of the observations. They are more suitable for large-area predictions and are designed to handle the non-linear relations and complexity found in soil data (Padarian et al., 2020).
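
The conventional formulation described above can be written compactly. The notation below is an assumption introduced for illustration (it does not come from the source): $Z(s)$ is the soil property at location $s$, $x_k(s)$ are the state-factor covariates, $\beta_k$ are regression coefficients, and $\lambda_i(s_0)$ are kriging weights derived from the variogram of the regression residuals.

```latex
% Decomposition into a deterministic trend and a spatially correlated residual
Z(s) = \underbrace{\beta_0 + \sum_{k=1}^{p} \beta_k \, x_k(s)}_{\text{deterministic part}}
     + \underbrace{\varepsilon(s)}_{\text{stochastic part}}

% Prediction at an unsampled location s_0: fitted trend plus kriged residuals,
% where e(s_i) = z(s_i) - \hat{\beta}_0 - \sum_{k} \hat{\beta}_k x_k(s_i)
\hat{Z}(s_0) = \hat{\beta}_0 + \sum_{k=1}^{p} \hat{\beta}_k \, x_k(s_0)
             + \sum_{i=1}^{n} \lambda_i(s_0) \, e(s_i)
```

This form is commonly called regression kriging. The linear trend captures the variation explained by the covariates, and the kriged residual term adds back the spatially structured variation the trend leaves unexplained.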

