Compared with conventional laboratory methods, visible–near-infrared (vis–NIR) reflectance spectroscopy is a more practical and cost-effective approach for estimating soil physical and chemical properties. This paper builds statistical machine learning models to investigate how well spectral data support a comprehensive evaluation of soil quality indicators. Seventeen physical and chemical properties were measured using standard methods as indicators of soil quality. Soil samples were scanned in the laboratory in the vis–NIR range (350–2500 nm) and divided into a calibration set of 31 samples, used for cross-validation, and a validation set of 13 samples, used for independent validation. Twenty-four preprocessing methods were tested to improve predictions, and partial least squares regression (PLSR) was used to predict the soil quality indicators. Judged by the model indices, the PLSR models had good predictive power (R² > 0.9, ratio of performance to deviation (RPD) > 3.0). Among the physical and chemical properties, bulk density (BD, R² = 0.97, RPD = 5.90), soil organic matter (SOM, R² = 0.98, RPD = 8.56), pH (R² = 0.95, RPD = 4.40), and total nitrogen (TN, R² = 0.98, RPD = 6.67) were well predicted, indicating that the method is suitable for predicting these properties in the study area. Among the nine heavy metals, the concentrations of five were well predicted; the exceptions were Mn, Zn, Cd, and As. Ranked from best to worst, the models' predictive ability was Hg, Cr, Pb, Ni, and Cu. The results show that a combination of spectroscopic and chemometric techniques can be applied as a practical, rapid, low-cost, and quantitative approach for evaluating soil physical and chemical properties in Shaanxi, China.
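The two model indices reported above can be illustrated with a minimal sketch. It assumes the common chemometric definitions: R² as one minus the ratio of residual to total sum of squares, and RPD as the standard deviation of the observed values divided by the root mean square error of prediction. The abstract does not state the exact formulas used (for example, sample versus population standard deviation), so these choices are assumptions, the function names are hypothetical, and the numbers are made-up illustration values, not data from the study.

```python
import math
import statistics


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error of prediction."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpd(observed, predicted):
    """Ratio of performance to deviation.

    Assumption: sample standard deviation (n - 1 denominator) of the
    observed values, divided by the RMSE of the predictions.
    """
    return statistics.stdev(observed) / rmse(observed, predicted)


# Made-up values for illustration only (not measurements from the paper).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.9, 5.1]

print(f"R2  = {r_squared(observed, predicted):.3f}")
print(f"RPD = {rpd(observed, predicted):.2f}")
```

Under these definitions, a model meeting the thresholds quoted in the abstract (R² > 0.9, RPD > 3.0) has a prediction error that is less than one third of the spread of the measured values.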