Abstract

A pedotransfer function is a mathematical model used to convert direct soil measurements into estimates of other soil properties. It provides information for modelling and simulation in soil research, hydrology, environmental science and climate change impact studies, including investigation of the carbon cycle and the exchange of carbon between soils and the atmosphere in support of carbon farming. In particular, a pedotransfer function can provide input parameters for landscape design, soil quality assessment and economic optimisation. The objective of this study was to investigate the feasibility of using a generalised pedotransfer function, derived with a machine learning method, to predict soil electrical conductivity (EC) and soil organic carbon content (OC) for different regional locations in the state of Victoria, Australia. This strategy supports a unified approach to the interpolation and population of a single regional soils database, in contrast to a range of pedotransfer functions derived from local databases whose measurement sets may have limited transferability. The pedotransfer function was generated with a machine learning algorithm incorporating the Generalized Linear Mixed Model with interactions and nested terms, Residual Maximum Likelihood estimation, and a predictor-frequency ranking system with step-wise reduction of predictors to evaluate the predictive errors of reduced models. The data source was the Victorian Soil Information System (VSIS), a database administered for soil information and mapping purposes. The database contains soil measurements and information from locations across Victoria and is a repository of historical data, including monitoring studies. In total, data from 93 projects were available as inputs to modelling and analysis, with 5158 samples used to derive predictors for EC and 1954 samples used to derive predictors for OC.
Over 500 models were tested by systematically reducing the number of predictors from the full model. Five-fold cross-validation was used to estimate the mean squared prediction error (MSPE) and mean absolute percentage error (MAPE) of each model. The results were statistically significant, with only a gradual reduction in error for the top-ranked 50 models. For the top-ranked model, the prediction errors (MSPE and MAPE) were 0.686 and 0.635 for EC, and 0.413 and 0.474 for OC, respectively. The four most frequently occurring predictors for both EC and OC prediction across the full set of models were soil depth, pH, particle size distribution and geomorphological mapping unit. The possible advantages and disadvantages of this approach are discussed with respect to other machine learning approaches.
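
The five-fold cross-validation estimate of MSPE and MAPE described above can be illustrated with a minimal sketch. It is not the study's implementation: an ordinary least-squares fit stands in for the Generalized Linear Mixed Model, the data are synthetic rather than drawn from VSIS, the function name `k_fold_errors` is hypothetical, and MAPE is expressed as a fraction, which is an assumption about how the reported values are scaled.

```python
import numpy as np


def k_fold_errors(X, y, k=5, seed=0):
    """Return (MSPE, MAPE) averaged over k cross-validation folds.

    Assumption: ordinary least squares stands in for the study's
    mixed model, which is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    mspe, mape = [], []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Add an intercept column and fit on the training folds only.
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        err = y[test] - A_test @ coef
        mspe.append(np.mean(err ** 2))
        # MAPE as a fraction (not multiplied by 100); needs nonzero y.
        mape.append(np.mean(np.abs(err / y[test])))
    return float(np.mean(mspe)), float(np.mean(mape))


# Synthetic data, purely illustrative (not VSIS measurements).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 10.0 + X @ np.array([1.0, -0.5, 0.25]) + rng.normal(scale=0.1, size=200)
mspe, mape = k_fold_errors(X, y)
```

Repeating this calculation for each reduced predictor set would give one pair of error estimates per candidate model, which is the kind of quantity a ranking of models could be based on.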
