The study of soil mean weight diameter (MWD), which is essential for sustainable soil management, has recently received much attention. Because measuring MWD is challenging, labor-intensive, and time-consuming, a predictive estimation method is needed to supply the information required for soil health assessment while saving the time and cost of soil analysis. Pedotransfer functions (PTFs) estimate parameters that are 'difficult to measure' and time-consuming from 'easy to measure' parameters. In the current study, an empirical PTF, i.e., multiple linear regression (MLR), and four machine-learning-based PTFs, i.e., artificial neural network (ANN), support vector machine (SVM), classification and regression trees (CART), and random forest (RF), were used to predict mean weight diameter in the Karnal district of Haryana, India. A total of 121 soil samples from the 0-15 and 15-30 cm soil depths were collected from seventeen villages in the Nilokheri, Nissing, and Assandh blocks of Karnal district. Soil parameters such as bulk density (BD), fractal dimension (D), soil texture (i.e., sand, silt, and clay), organic carbon (OC), and glomalin content were used as input variables. Two input combinations were used, one with texture data (dataset 1) and the other with fractal dimension data replacing texture (dataset 2), and the complete dataset (121 samples) was divided into training and testing datasets in a 4:1 ratio. Model performance was evaluated with statistical parameters such as mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), normalized root mean square error (NRMSE), and coefficient of determination (R²). The comparison showed that including the fractal dimension in the input dataset improved the prediction capability of ANN, SVM, and RF. MLR and CART showed lower predictive ability than the other three approaches (i.e., ANN, SVM, and RF).
In the training dataset, the RMSE (mm) of the SVM model was 8.33% lower with D than with texture as the input, whereas in the testing dataset it was 16.67% lower. Because SVM is more flexible and captures non-linear relationships effectively, it outperformed the other models in predicting MWD. In this study, the SVM model with D as input performed best among the models tested and shows high potential for MWD prediction in the Karnal district of Haryana, India.
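The 4:1 split and the evaluation statistics named above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the random shuffle with a fixed seed, the normalization of NRMSE by the mean of the observed values, and the definition of R² as 1 − SS_res/SS_tot are all assumptions made for illustration; the abstract does not state which conventions were used.

```python
import math
import random


def split_train_test(samples, test_fraction=0.2, seed=0):
    """Shuffle and split samples; test_fraction=0.2 gives a 4:1 ratio.

    Assumption: a simple random split. The abstract does not say how
    the samples were assigned to the training and testing sets.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mae(obs, pred):
    """Mean absolute error, in the units of the observations."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error (%); observations must be non-zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nrmse(obs, pred):
    """RMSE divided by the observed mean (assumed normalization)."""
    return rmse(obs, pred) / (sum(obs) / len(obs))


def r2(obs, pred):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed form)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def percent_lower(baseline, value):
    """How much lower `value` is than `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline
```

With 121 samples and this rounding, `split_train_test` returns 97 training and 24 testing samples, which may differ from the split used in the study. `percent_lower` reflects how a statement such as "RMSE was 8.33% lower with D than with texture" is usually computed, with the texture-based RMSE as the baseline.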