Abstract

Soil thermal conductivity (λ) is a key thermal property for studies of the surface energy balance and water balance. We used 1602 measured soil thermal conductivity values representing 189 soils to evaluate five empirical models (de Vries 1963 (DV1963), Campbell 1985 (CG1985), Johansen 1975, Côté and Konrad 2005, and Lu et al. 2007) and seven machine learning (ML) algorithms (Decision Tree (DT), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), Linear Regression (LR), K-Nearest Neighbors (KNN), Neural Network (NN), and Gaussian Process (GP)) for estimating λ. The average root mean squared error (RMSE) of the ML algorithms was 66% of the empirical-model average on the validation set and 82% on the test set. The three best ML algorithms (GBDT, NN, RF) performed significantly better than the three best empirical models (Lu et al. 2007, Côté and Konrad 2005, Johansen 1975): 0.183 < RMSE < 0.259 (W m−1 K−1) for the ML algorithms and 0.293 < RMSE < 0.320 (W m−1 K−1) for the empirical models. Among the ML algorithms, we recommend GBDT, NN, and RF. Among the empirical models, we recommend the three normalized models (Lu et al. 2007, Côté and Konrad 2005, Johansen 1975) over the physically based model (DV1963) and the regression model (CG1985). The feature importance rankings produced by RF and GBDT show that soil moisture content and soil bulk density are the most critical factors affecting λ; together they account for more than 80% of the total feature importance. RF gives more consistent feature importance rankings than GBDT, so we recommend RF for feature selection.
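
As a reading aid for the RMSE comparison above, the following minimal sketch shows how an RMSE and an ML-to-empirical RMSE ratio could be computed. The function name `rmse` and every number in the arrays are hypothetical, invented purely for illustration; they are not values from the study's dataset.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(measured) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not measured:
        raise ValueError("sequences must be non-empty")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical thermal conductivity values in W m^-1 K^-1 (illustration only).
measured = [0.50, 1.00, 1.50, 2.00]
ml_pred = [0.60, 0.90, 1.60, 1.90]         # stand-in for an ML algorithm
empirical_pred = [0.70, 0.80, 1.70, 1.80]  # stand-in for an empirical model

rmse_ml = rmse(measured, ml_pred)
rmse_emp = rmse(measured, empirical_pred)

# A ratio below 1 means the ML predictions have the lower error.
ratio = rmse_ml / rmse_emp
print(f"ML RMSE: {rmse_ml:.3f}, empirical RMSE: {rmse_emp:.3f}, ratio: {ratio:.0%}")
```

With these made-up inputs the ratio is 50%; the 66% and 82% figures reported above are the same kind of ratio, taken over the averaged RMSE values of each model group.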
