Comparing Machine Learning Models and Hybrid Geostatistical Methods Using Environmental and Soil Covariates for Soil pH Prediction

Panagiotis Tziachris, Vassilis Aschonitis, Ioannis (John) D Doukas, Theocharis Chatzistathis, Maria Papadopoulou

doi:10.3390/ijgi9040276

Abstract

This paper assesses machine learning (ML) models and hybrid geostatistical methods for predicting soil pH from digital elevation model derivatives (environmental covariates) and co-located soil parameters (soil covariates). The study area was Grevena, Greece, where 266 disturbed soil samples were collected from randomly selected locations and analyzed in the laboratory of the Soil and Water Resources Institute. The models assessed were random forests (RF), random forests kriging (RFK), gradient boosting (GB), gradient boosting kriging (GBK), neural networks (NN), and neural networks kriging (NNK). Multiple linear regression (MLR), ordinary kriging (OK), and regression kriging (RK) are not ML models but were included for comparison. GB and RF gave the best results, with NN a close second. Applying OK to the ML models’ residuals did not have a major impact. The classical geostatistical and hybrid geostatistical methods without ML (OK, MLR, and RK) predicted less accurately than the models that included ML. Different implementations (methods and packages) of the same ML models were also assessed. For RF and GB, the implementations applied (ranger-ranger, randomForest-rf, xgboost-xgbTree, xgboost-xgbDART) led to similar results, whereas for NN the differences between the implementations (nnet-nnet and nnet-avNNet) were more distinct. Finally, ML models tuned through random search optimization were compared with the same models at their default values. Optimization improved the predictions only where the ML algorithm had a large number of hyperparameters to tune and the optimized values differed significantly from the defaults, as with GB and NN but not RF.
Overall, the study concluded that although RF and GB had approximately the same prediction accuracy, RF gave more consistent results regardless of the package, the hyperparameter selection method, or the application of OK to the ML models’ residuals.
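
To illustrate the comparison between default and randomly searched hyperparameters described above, here is a minimal Python sketch. It is not the study's workflow: the abstract names R packages, whereas this example uses only the Python standard library. The inverse distance weighting regressor (a stand-in for RF, GB, or NN), the synthetic pH-like data, the search ranges, the default values, and the names idw_predict, rmse, and random_search are all assumptions chosen to keep the example short and self-contained.

```python
import math
import random


def idw_predict(train, xy, k, power):
    """Inverse-distance-weighted mean of the k nearest training samples.

    k and power are the two hyperparameters of this stand-in model.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[:2], xy))[:k]
    num = den = 0.0
    for x, y, value in nearest:
        d = math.dist((x, y), xy)
        if d == 0.0:
            return value  # target coincides with a training sample
        w = d ** -power
        num += w * value
        den += w
    return num / den


def rmse(train, valid, k, power):
    """Root mean squared error on a hold-out set."""
    errs = [(idw_predict(train, s[:2], k, power) - s[2]) ** 2 for s in valid]
    return math.sqrt(sum(errs) / len(errs))


def random_search(train, valid, n_iter=20, seed=0):
    """Sample hyperparameters at random and keep the best trial."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_iter):
        params = {"k": rng.randint(1, 10), "power": rng.uniform(0.5, 4.0)}
        trials.append({**params, "rmse": rmse(train, valid, **params)})
    return min(trials, key=lambda t: t["rmse"]), trials


# Synthetic (x, y, value) samples; NOT data from the study.
data_rng = random.Random(42)
points = [(data_rng.uniform(0, 100), data_rng.uniform(0, 100)) for _ in range(60)]
data = [(x, y, 6.0 + 0.01 * x + 0.005 * y + data_rng.gauss(0, 0.1))
        for x, y in points]
train, valid = data[:45], data[45:]

best, trials = random_search(train, valid)
default_rmse = rmse(train, valid, k=5, power=2.0)  # assumed "default" values
print(f"default RMSE: {default_rmse:.3f}")
print(f"tuned RMSE:   {best['rmse']:.3f} (k={best['k']}, power={best['power']:.2f})")
```

Whether the tuned error is lower than the default error depends on the data and on how far the defaults sit from the best sampled values, which is the kind of dependence the abstract reports for GB and NN versus RF.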

Highlights

Environmental sciences have always been interested in accurately predicting the spatial distributions of different phenomena regarding soil, water, air, etc. [1,2,3,4].
Optimization improved the predictions only where the machine learning (ML) algorithm had a large number of hyperparameters to tune and the optimized values differed significantly from the defaults, as with gradient boosting (GB) and neural networks (NN) but not random forests (RF).
In this study, ML models outperformed the other methods in predicting soil pH, namely multiple linear regression (MLR) and the kriging-based methods (regression kriging (RK) and ordinary kriging (OK)).

Summary

Introduction

Environmental sciences have always been interested in accurately predicting the spatial distributions of different phenomena regarding soil, water, air, etc. [1,2,3,4]. The growing volume of digital data (Internet of Things, high-accuracy digital elevation models (DEM), satellite images) presents a great opportunity for improved prediction results. Spatial phenomena have been predicted with spatial prediction methods that fall mainly into two categories: deterministic methods, like inverse distance weighting or nearest neighbors, and stochastic ones, like regression models and kriging variations (e.g., ordinary kriging, universal kriging, etc.). Hybrid methods that are partially deterministic and partially stochastic, like regression kriging (RK) or kriging with external drift (KED), were later introduced [5,6,7]. These methods try to combine the advantages of the deterministic and stochastic approaches to achieve improved results. More innovative implementations of these hybrid methods are increasingly used; they introduce machine learning (ML) as the deterministic part, along with kriging of the ML residuals as the stochastic part [8,9,10,11,12].
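
The hybrid idea described above (a fitted trend plus kriging of its residuals) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: a one-covariate linear regression stands in for the ML model, the exponential variogram and its sill and range are assumed rather than fitted, the five samples are made up, and the names solve, fit_trend, krige_residual, and predict are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_trend(cov, target):
    """Least-squares intercept and slope (stand-in for the ML model)."""
    n = len(cov)
    mc, mt = sum(cov) / n, sum(target) / n
    slope = (sum((c - mc) * (t - mt) for c, t in zip(cov, target))
             / sum((c - mc) ** 2 for c in cov))
    return mt - slope * mc, slope


def krige_residual(coords, resid, target_xy, sill=1.0, rng=50.0):
    """Ordinary kriging of residuals with an assumed exponential variogram."""
    def gamma(p, q):
        return sill * (1.0 - math.exp(-math.dist(p, q) / rng))

    n = len(coords)
    # Kriging system: semivariances plus a Lagrange row/column that
    # forces the weights to sum to one.
    a = [[gamma(coords[i], coords[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(p, target_xy) for p in coords] + [1.0]
    w = solve(a, b)[:n]
    return sum(wi * ri for wi, ri in zip(w, resid)), w


def predict(samples, target_xy, target_cov):
    """Hybrid prediction: trend at the target plus the kriged residual."""
    coords = [(s[0], s[1]) for s in samples]
    cov = [s[2] for s in samples]
    ph = [s[3] for s in samples]
    b0, b1 = fit_trend(cov, ph)
    resid = [p - (b0 + b1 * c) for p, c in zip(ph, cov)]
    r_hat, w = krige_residual(coords, resid, target_xy)
    return b0 + b1 * target_cov + r_hat, w


# Made-up samples: (x, y, elevation covariate, pH). NOT data from the study.
SAMPLES = [
    (0.0, 0.0, 650.0, 6.1),
    (40.0, 10.0, 700.0, 6.4),
    (15.0, 35.0, 720.0, 6.8),
    (60.0, 50.0, 810.0, 7.2),
    (30.0, 70.0, 760.0, 7.0),
]

prediction, weights = predict(SAMPLES, (30.0, 30.0), 730.0)
print(f"predicted pH at (30, 30): {prediction:.2f}")
```

Because the assumed variogram has no nugget, the kriging step interpolates the residuals exactly, so predicting at a sampled location with that sample's covariate returns the observed value. In practice the variogram would be fitted to the residuals rather than assumed.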



Journal: ISPRS International Journal of Geo-Information
Publication Date: Apr 23, 2020
Citations: 20
License type: CC BY 4.0
