Predictor Selection and Machine Learning Regression Methods to Predict Saturated Hydraulic Conductivity From a Large Public Soil Database

Toby A Adjuik,Ole Wendroth,Michael Patrick Sama,Michael Montross,Sue E Nokes

doi:10.13031/ja.15068

Abstract

Highlights

- Six machine learning (ML) models were developed using a large soil database to predict saturated hydraulic conductivity from easily measured soil characteristics.
- Tree-based regression models outperformed all other ML models tested.
- Neural networks were not suitable for predicting saturated hydraulic conductivity.
- Of the predictors examined, clay content, followed by bulk density, explained the largest share of the variation in the data.

Abstract. Saturated hydraulic conductivity (Ks) is one of the most important soil hydraulic properties for modeling water transport in the vadose zone. However, it is challenging to measure in the field. Pedotransfer functions (PTFs) are mathematical models that can predict Ks from easily measured soil characteristics. Though the development of PTFs for predicting Ks is not new, the tools and methods used to predict Ks are continuously evolving. Model performance depends on choosing soil features that explain the largest amount of Ks variance with the fewest input variables. In addition, the lack of interpretability in most "black box" machine learning (ML) models makes it difficult to extract practical knowledge, as the machine learning process obfuscates the relationship between inputs and outputs in the PTF models. The objective of this study was to develop a set of new PTFs for predicting Ks using ML algorithms and a large database of over 8000 soil samples (the Florida Soil Characterization Database) while incorporating statistical methods to inform predictor selection for the model inputs. Of the ML models tested, random forest regression (RF) and gradient-boosted regression (GB) gave the best performances, with R² = 0.71 and RMSE = 0.47 cm h⁻¹ on the test data for both. Using the permutation feature importance technique, the GB and RF regression models showed similar results, where clay content described the most variation in the data, followed by bulk density. The implication of this study is that, when predicting Ks using the Florida Soil Characterization Database, priority should be given to obtaining quality data on clay content and bulk density, as they are the most influential predictors for estimating Ks.

Keywords: Deep learning, Gradient boosted regression, Pedotransfer functions, Random forest regression, Soil database, Soil properties.
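The abstract reports R² and RMSE on test data and ranks predictors with permutation feature importance. As a rough illustration of how those quantities relate, the sketch below computes all three in plain Python. It is not the authors' code or data: `toy_model`, its coefficients, the synthetic samples, and the `unused` feature are invented for illustration, and the hand-written function merely stands in for a fitted RF or GB regressor.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of y."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in R2 when one feature column at a time is shuffled."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = r2_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


def toy_model(row):
    """Hypothetical stand-in for a fitted regressor (invented coefficients)."""
    clay, bulk_density, _unused = row
    return 2.0 - 0.04 * clay - 1.0 * bulk_density


# Synthetic samples: [clay %, bulk density, unused feature]. Not real soil data.
data_rng = random.Random(42)
X = [
    [data_rng.uniform(0, 60), data_rng.uniform(1.0, 1.8), data_rng.uniform(0, 1)]
    for _ in range(200)
]
y = [toy_model(row) + data_rng.gauss(0, 0.1) for row in X]

baseline_r2, importances = permutation_importance(toy_model, X, y)
test_rmse = rmse(y, [toy_model(row) for row in X])

for name, score in zip(["clay", "bulk_density", "unused"], importances):
    print(f"{name}: mean R2 drop = {score:.3f}")
```

A feature the model ignores shows no drop in R² when shuffled, while a feature the model relies on heavily shows a large drop. With a fitted scikit-learn estimator, `sklearn.inspection.permutation_importance` offers a comparable calculation.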

Published in: Journal of the ASABE
