External factors such as moisture content reduce the accuracy of soil organic carbon (SOC) prediction from on-line visible and near-infrared (vis-NIR) spectroscopy. This study compared the performance of four algorithms for removing the moisture content effect [direct standardization (DS), piecewise direct standardization (PDS), external parameter orthogonalization (EPO), and orthogonal signal correction (OSC)] against non-corrected (NC) spectra, with models developed using partial least squares regression (PLSR), support vector machine (SVM), random forest (RF), and M5Rules regression. An on-line soil sensing platform coupled with a vis-NIR spectrophotometer (305–1700 nm) was used to scan twelve agricultural fields in Belgium and France. A total of 372 soil samples collected during the on-line measurement were divided into a calibration dataset (260 samples) and a prediction dataset (112 samples) using the Kennard-Stone algorithm. The prediction set, together with laboratory-measured dry-soil spectra of the same 112 samples, formed a transfer dataset used to develop the EPO, DS and PDS correction matrices. Models built after EPO, PDS and OSC correction were more accurate [coefficient of determination (R2) = 0.60–0.82, root mean square error (RMSE) = 16.1–5.7 g kg−1] than the NC models (R2 = 0.58–0.73, RMSE = 16.5–6.8 g kg−1), whereas DS correction (R2 = −0.10 to 0.26, RMSE = 26.8–21.9 g kg−1) degraded prediction accuracy. The EPO and OSC models were more accurate than the PDS-corrected models. OSC-M5Rules (R2 = 0.82, RMSE = 5.7 g kg−1) achieved the highest accuracy, followed by EPO-M5Rules (R2 = 0.74, RMSE = 6.7 g kg−1) and NC-M5Rules (R2 = 0.73, RMSE = 6.8 g kg−1), all of which outperformed every PLSR, RF and SVM model. Therefore, on-line vis-NIR spectra should be corrected with the OSC algorithm before calibrating a machine learning model for accurate prediction of SOC.
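To illustrate how a correction matrix can be derived from a transfer dataset of paired moist and dry spectra, the following is a minimal sketch of the commonly described EPO formulation, in which the leading singular vectors of the moist-minus-dry difference spectra are projected out. It is not the implementation used in this study. The function name `epo_projection`, the synthetic spectra, and the choice of a single component are assumptions made only for illustration; in practice the number of components is a tuning parameter.

```python
import numpy as np


def epo_projection(wet, dry, n_components=1):
    """Return an EPO projection matrix P = I - V_k V_k^T (hypothetical helper).

    wet, dry: arrays of shape (n_samples, n_bands) holding paired spectra
    of the same samples scanned moist (on-line) and dry (laboratory).
    n_components: number of moisture-related directions to remove.
    """
    wet = np.asarray(wet, dtype=float)
    dry = np.asarray(dry, dtype=float)
    if wet.shape != dry.shape:
        raise ValueError("wet and dry spectra must be paired, same shape")

    # Difference spectra capture the variation attributed to moisture.
    diff = wet - dry

    # Right singular vectors of the difference matrix span the
    # directions in spectral space most affected by moisture.
    _, _, vt = np.linalg.svd(diff, full_matrices=False)
    v_k = vt[:n_components].T  # shape (n_bands, n_components)

    # Project onto the subspace orthogonal to those directions.
    return np.eye(wet.shape[1]) - v_k @ v_k.T


# Synthetic transfer set (assumed data, not from the study):
# dry spectra plus a moisture effect along one fixed spectral direction.
rng = np.random.default_rng(0)
n_samples, n_bands = 112, 50
dry = rng.normal(size=(n_samples, n_bands))
moisture_direction = np.linspace(0.0, 1.0, n_bands)
moisture_level = rng.uniform(0.1, 0.4, size=(n_samples, 1))
wet = dry + moisture_level * moisture_direction

P = epo_projection(wet, dry, n_components=1)

# The same projection is applied to calibration and prediction spectra
# before a regression model is fitted.
wet_corrected = wet @ P
dry_corrected = dry @ P
```

In this synthetic case the moisture effect lies along a single direction, so the projected moist and dry spectra coincide; real spectra would typically need more components and would retain some residual difference.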