Abstract

<abstract> <bold><sc>Abstract. </sc></bold>The aim of this work was to select informative variables for modeling soil nitrogen (N) and organic carbon (OC) from near-infrared spectra and to interpret the selected variables. The dataset of 225 soil samples was randomly split into calibration, validation, and prediction sets. Spectra in the calibration set were subjected to variable selection by Monte Carlo uninformative variable elimination (MC-UVE) and the successive projections algorithm (SPA). Partial least squares regression (PLSR) and multiple linear regression (MLR) were used to construct calibration models for each property based on the selected variables. The proposed MC-UVE-PLSR model outperformed full-spectrum PLSR and SPA-MLR for both soil N and OC. The coefficients of determination (R<sup>2</sup>) were 0.87 and 0.88, and the residual prediction deviations (RPD) were 2.8 and 2.9, for N and OC, respectively. The results indicate that MC-UVE is an effective tool for spectral variable selection and can improve model prediction accuracy and efficiency. Analysis of the feature variables shows that some of the selected variables for soil N and OC are directly related to their functional groups, while others influence the prediction results by reflecting soil moisture content. The results also suggest that soil N is better predicted from its own feature variables than indirectly through its correlation with soil OC.
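The two reported metrics can be illustrated with a minimal sketch. The abstract does not define RPD, so the code assumes a common convention: the sample standard deviation of the reference values divided by the root mean square error of prediction. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Residual prediction deviation.

    Assumed definition: sample SD of the reference values / RMSE of prediction.
    Some authors use the bias-corrected standard error instead of the RMSE.
    """
    n = len(y_true)
    mean_y = sum(y_true) / n
    sd = math.sqrt(sum((t - mean_y) ** 2 for t in y_true) / (n - 1))
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return sd / rmse


# Hypothetical reference and predicted values, for illustration only.
y_ref = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]

print(f"R2  = {r_squared(y_ref, y_hat):.3f}")
print(f"RPD = {rpd(y_ref, y_hat):.2f}")
```

Under this convention, RPD expresses how much smaller the prediction error is than the natural spread of the reference values, so a higher RPD means a more useful model.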
