Abstract

Spatially continuous soil thickness data at large scales are rarely available and are often difficult and expensive to acquire. Machine learning algorithms have become popular in digital soil mapping for predicting and mapping the spatial distribution of soil properties. Identifying the environmental variables that control soil thickness and selecting suitable machine learning algorithms are both vital to modeling. In this study, 11 quantitative and four qualitative environmental variables were selected to explore the main variables that affect soil thickness. Four commonly used machine learning algorithms, multiple linear regression (MLR), support vector regression (SVR), random forest (RF), and extreme gradient boosting (XGBoost), were evaluated as individual models to predict and map the soil thickness distribution in Henan Province, China. In addition, two stacking ensemble models, one using the least absolute shrinkage and selection operator (LASSO) and one using a generalized boosted regression model (GBM), were tested and applied to build the most reliable and accurate estimation model. The results showed that variable selection was a very important part of soil thickness modeling. Topographic wetness index (TWI), slope, elevation, land use and enhanced vegetation index (EVI) were the most influential environmental variables. Comparative results showed that the XGBoost model outperformed the MLR, RF and SVR models. Importantly, both stacking models achieved higher performance than the individual models, especially the one using GBM. In terms of accuracy, the proposed stacking method explained 64.0% of the variation in soil thickness. The results of our study provide useful alternative approaches for mapping soil thickness, with potential for use with other soil properties.

Highlights

  • Soil thickness is considered to play an important role in numerous areas, such as soil structure and function [1], vegetation growth [2], land surface energy flux [3], hydrology [4] and ecological land classification [5]

  • Exhaustive covariates and machine learning methods were applied to build the most reliable and accurate estimation model to provide the spatial distribution of soil thickness for Henan Province in China

  • The results suggested that using qualitative environmental variables could improve the accuracy of soil thickness estimations; in particular, soil thickness values differed significantly among the categories of each qualitative variable

Summary

Introduction

Soil thickness is considered to play an important role in numerous areas, such as soil structure and function [1], vegetation growth [2], land surface energy flux [3], hydrology [4] and ecological land classification [5]. Current soil thickness mapping methods can be classified into three categories: (1) physically based models, (2) empirical-statistical models built using environmental covariates, and (3) interpolation from point samples [11]. These mathematical or statistical methods are based on the key landscape factors and processes that determine the formation of soil properties. Spatial patterns in soil thickness result from complex interactions of soil-forming environmental factors, including terrain relief, climate, parent material, biological factors, human activities, physical processes and time [6,12]. Quantitative environmental variables, such as elevation, slope and aspect derived from digital elevation models and vegetation indices derived from remote sensing data, have been widely used. Qualitative environmental variables, such as geomorphic maps, geological maps, land use types, and legacy soil maps, could also be important parameters for predicting soil properties [13,14].
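To make the distinction between quantitative and qualitative covariates concrete, the following is a minimal Python sketch, not the procedure used in the study. It assumes slope is estimated from a gridded digital elevation model by central differences and that a qualitative variable such as land use is one-hot encoded before modeling. The function names, the toy grid, and the land use class labels are all hypothetical.

```python
import math


def slope_degrees(dem, row, col, cell_size):
    """Slope (degrees) at an interior DEM cell, using central differences.

    Assumption: `dem` is a list of rows of elevations on a square grid
    whose spacing `cell_size` is in the same units as the elevations.
    """
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2.0 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


def one_hot(category, categories):
    """Encode a qualitative value as 0/1 indicators, one per known class."""
    return [1.0 if category == c else 0.0 for c in categories]


def covariate_vector(dem, row, col, cell_size, land_use, land_use_classes):
    """Combine quantitative (elevation, slope) and qualitative (land use)
    covariates into one numeric feature vector for a regression model."""
    quantitative = [float(dem[row][col]), slope_degrees(dem, row, col, cell_size)]
    return quantitative + one_hot(land_use, land_use_classes)


# Hypothetical toy data: a plane rising 1 unit per cell in the x direction.
toy_dem = [
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 2.0],
]
toy_classes = ["cropland", "forest", "urban"]  # illustrative labels only
features = covariate_vector(toy_dem, 1, 1, 1.0, "forest", toy_classes)
```

Each sampled location then contributes one such vector, and any of the regression algorithms named in the abstract could in principle be fitted on the resulting table against measured soil thickness.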
