Non-destructive, fast, and accurate prediction of soil organic matter content in farmland is of great significance for soil fertility assessment and rational fertilization. Different prediction models have different strengths, so integrating them into a combined prediction model of soil organic matter content can improve both prediction accuracy and generalization ability. This study took the organic matter content of agricultural soils as its research object. Visible near-infrared hyperspectral curves of the soils were measured with the Starter Kit indoor mobile scanning platform (Headwall Photonics, Bolton, MA, USA). First, the original spectral curves were denoised by Savitzky–Golay (S-G) smoothing. Second, a first-order differential transform was applied to the smoothed spectra, and feature bands were selected from the transformed spectra using an L1-norm-based feature selection algorithm. Third, eight algorithms were applied to the selected feature bands to construct single-prediction models of soil organic matter content: LASSO Regression (LASSO) (Model 1), Multilayer Perceptron (MLP) (Model 2), Random Forest (RF) (Model 3), Gaussian Kernel Regression (GKR) (Model 4), Ridge Regression (Model 5), Long Short-Term Memory (LSTM) (Model 6), Convolutional Neural Networks (CNN) (Model 7), and Support Vector Regression (SVR) (Model 8). Finally, an optimal linear combination-prediction model was constructed from the eight single-prediction models, and a standard deviation-based prediction validity measure was used to evaluate it.
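The preprocessing chain described above (S-G smoothing, first-order differential transform, L1-norm feature selection) can be sketched roughly as follows. This is only an illustrative sketch on synthetic spectra, not the study's implementation: the array sizes, wavelength range, S-G window and polynomial order, the LASSO penalty `alpha`, and the use of SciPy and scikit-learn are all assumptions, since the abstract does not specify them.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Synthetic stand-in for measured spectra (all sizes and ranges are hypothetical).
n_samples, n_bands = 60, 200
wavelengths = np.linspace(400.0, 1000.0, n_bands)       # nm, assumed range
som = rng.uniform(5.0, 40.0, size=n_samples)            # made-up SOM values
baseline = np.exp(-((wavelengths - 700.0) / 150.0) ** 2)
absorption = np.exp(-((wavelengths - 600.0) / 20.0) ** 2)
spectra = baseline[None, :] - 0.005 * som[:, None] * absorption[None, :]
spectra += rng.normal(0.0, 0.002, size=spectra.shape)   # measurement noise

# Step 1: Savitzky-Golay smoothing along the band axis (window/order assumed).
smoothed = savgol_filter(spectra, window_length=11, polyorder=2, axis=1)

# Step 2: first-order derivative of each spectrum with respect to wavelength.
first_deriv = np.gradient(smoothed, wavelengths, axis=1)

# Step 3: L1-norm feature selection. Bands whose LASSO coefficient is
# shrunk exactly to zero are dropped; alpha is an assumed penalty strength.
X = (first_deriv - first_deriv.mean(axis=0)) / first_deriv.std(axis=0)
lasso = Lasso(alpha=0.5, max_iter=10000).fit(X, som)
selected = np.flatnonzero(lasso.coef_)
X_selected = first_deriv[:, selected]

print(f"kept {selected.size} of {n_bands} bands")
```

In a sketch like this, `X_selected` would be the input handed to each of the eight single-prediction models; in practice the penalty strength would normally be tuned by cross-validation rather than fixed.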
The results showed the following: (1) the weights of the eight single-prediction models in the combined prediction model were ω1*=0.099, ω2*=0.202, ω3*=0.000, ω4*=0.357, ω5*=0.088, ω6*=0.089, ω7*=0.000, and ω8*=0.165, respectively; (2) across the eight single-prediction models, the average accuracy E of the predicted soil organic matter content was 0.856, the average standard deviation σ was 0.181, and the average prediction validity M was 0.702; (3) the accuracy E of the combined model was 0.893, which was 4.322% higher than the average accuracy of the single models; the standard deviation σ of the combined model was 0.129, which was 28.333% lower than the average standard deviation of the single models; and the prediction validity M of the combined model was 0.778, which was 10.826% higher than the average prediction validity of the single models. The combined model can therefore be used to estimate soil organic matter content in farmland effectively from visible near-infrared spectral data, and it provides a basis and reference for the rapid detection of soil organic matter content in farmland.
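To make the reported quantities concrete, the following sketch applies the reported weights as a linear combination and computes E, σ, and M for toy data. The abstract does not give the formulas, so the definitions used here are assumptions: per-sample accuracy is taken as 1 minus the absolute relative error (floored at 0), E and σ as its mean and (population) standard deviation, and M = E(1 − σ). The reported combined-model values are consistent with that last form, since 0.893 × (1 − 0.129) ≈ 0.778. The function names and toy predictions are hypothetical.

```python
import numpy as np

# Weights reported in the abstract for Models 1-8
# (LASSO, MLP, RF, GKR, Ridge, LSTM, CNN, SVR).
weights = np.array([0.099, 0.202, 0.000, 0.357, 0.088, 0.089, 0.000, 0.165])

def combine(single_preds, w):
    """Linear combination: rows are samples, columns are single models."""
    return np.asarray(single_preds, dtype=float) @ np.asarray(w, dtype=float)

def prediction_validity(y_true, y_pred):
    """Return (E, sigma, M) under the ASSUMED definitions:
    A_t = 1 - |relative error| (0 if the error exceeds 100%),
    E = mean(A_t), sigma = std(A_t), M = E * (1 - sigma)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel_err = np.abs((y_true - y_pred) / y_true)
    acc = np.where(rel_err < 1.0, 1.0 - rel_err, 0.0)
    E = acc.mean()
    sigma = acc.std()
    return E, sigma, E * (1.0 - sigma)

# Toy data: eight noisy "model outputs" around made-up true values.
rng = np.random.default_rng(1)
y = rng.uniform(10.0, 30.0, size=50)
preds = y[:, None] * (1.0 + rng.normal(0.0, 0.15, size=(50, 8)))

y_comb = combine(preds, weights)
E, sigma, M = prediction_validity(y, y_comb)
print(f"E={E:.3f}, sigma={sigma:.3f}, M={M:.3f}")

# Consistency check of the assumed formula against the reported values.
M_reported = 0.893 * (1 - 0.129)
```

Under these definitions a model is rewarded both for high average accuracy and for accuracy that varies little from sample to sample, which is why a combination can raise M even when its gain in E alone is modest.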