Abstract

In response to the demand for spatial soil information to support the sustainable management of soil resources, this study applies a digital soil mapping approach to predict soil classes for a 7000 ha area in Kurdistan province, Iran. Based on a stratified random sampling design, 91 soil profiles were located, described, and classified into soil great groups. Environmental covariates used for modelling soil classes included terrain derivatives, remote sensing data, distance-based rasters, and legacy geospatial information (e.g., a geological map). To address multicollinearity among the predictors, three dimensionality reduction techniques were tested: principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and the novel Uniform Manifold Approximation and Projection (UMAP). With each method, an initial suite of 160 environmental covariates was reduced to 10, and the reduced set was used to train a Random Forest (RF) model. The most effective model coupled UMAP with Random Forest (RF-UMAP) and yielded a kappa index of 0.73 and an overall accuracy of 0.80. Within Kurdistan, topography and parent material were the main soil-forming factors influencing the prediction of the soil classes. Overall, UMAP outperformed PCA and t-SNE. This study demonstrates the value of advanced dimension reduction methods for handling non-linear relationships among predictor variables when using RF.
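The two reported metrics are related: overall accuracy is the raw fraction of correctly predicted profiles, while the kappa index discounts the agreement expected by chance from the class frequencies. The sketch below illustrates that relationship in plain Python. It is not the authors' evaluation code; the function names and the class labels `"A"` and `"B"` are hypothetical placeholders, and the sample data are invented for illustration only.

```python
from collections import Counter


def overall_accuracy(observed, predicted):
    """Fraction of samples whose predicted class equals the observed class."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)


def cohen_kappa(observed, predicted):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (overall accuracy); p_e is the agreement
    expected by chance given the marginal class frequencies.
    """
    n = len(observed)
    p_o = overall_accuracy(observed, predicted)
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    # Chance agreement: sum over classes of the product of marginal proportions.
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical labels, NOT data from the study.
    observed = ["A"] * 5 + ["B"] * 5
    predicted = ["A"] * 4 + ["B"] + ["B"] * 4 + ["A"]
    print("overall accuracy:", overall_accuracy(observed, predicted))
    print("kappa:", cohen_kappa(observed, predicted))
```

In this toy two-class case the chance agreement is 0.5, so kappa falls below the overall accuracy. The same effect explains why the study's kappa (0.73) is lower than its overall accuracy (0.80).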

