Locally linear embedding (LLE) is a nonlinear dimensionality reduction method with notable advantages over linear dimensionality reduction methods. However, traditional LLE uses the Euclidean distance as its distance measure, which does not accurately reflect the spatial relationships among high-dimensional near-infrared spectral data and therefore leads to poor modeling performance. This paper improves LLE with different distance metrics and proposes a rapid detection method for maize seed germination rate based on improved locally linear embedding and near-infrared spectroscopy. A total of 315 samples from 7 types of maize seeds, purchased from the seed market, were used as the research material. We subjected them to artificial aging at 8 gradients (0 d to 7 d at intervals of 1 d) and completed the germination rate test after collecting the near-infrared spectral data for each sample. The Monte Carlo cross-validation (MCCV) algorithm, combined with partial least squares regression (PLSR) and support vector machine (SVM) models, was used to remove abnormal samples from the spectral and germination rate data. For comparison, we then ran LLE with five distance measures (the traditional Euclidean distance, Manhattan distance, Chebyshev distance, correlation coefficient, and cosine similarity) to reduce the dimension of the spectral data, and established PLS and SVM germination rate prediction models. We compared the prediction performance of the resulting models to identify the best distance measure for LLE dimension reduction. Cosine similarity was the best improvement strategy under the same modeling method: the test-set R² of the LLE_cos-PLS model reached 0.8384, and the test-set R² of the LLE_cos-SVM model reached 0.8765.
These results indicate that cosine similarity better reflects the spatial distribution of the spectral data of aged maize seeds, and that model precision is higher after LLE_cos dimensionality reduction. Compared with the linear modeling method PLS, the nonlinear modeling method SVM is better suited to predicting the germination rate of maize seeds. This study may provide a reference method for the quality inspection of other agricultural products.
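
The idea of swapping the distance measure used for neighbor selection in LLE can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `lle_custom_metric`, the regularization constant, the neighbor count, and the synthetic input are all assumptions made for illustration. Only the neighbor search changes with the metric; the reconstruction weights and the embedding follow the standard LLE formulation.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist


def lle_custom_metric(X, n_neighbors=10, n_components=2, metric="cosine", reg=1e-3):
    """Standard LLE with a pluggable distance metric for the neighbor search.

    `metric` is passed to scipy's cdist, e.g. "euclidean", "cityblock"
    (Manhattan), "chebyshev", "correlation", or "cosine".
    Hypothetical helper for illustration only.
    """
    n = X.shape[0]

    # Step 1: pick each sample's nearest neighbors under the chosen metric.
    D = cdist(X, X, metric=metric)
    np.fill_diagonal(D, np.inf)  # a sample is never its own neighbor
    neighbors = np.argsort(D, axis=1)[:, :n_neighbors]

    # Step 2: solve for the weights that best reconstruct each sample
    # from its neighbors, constrained to sum to one.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]            # neighbors centered on sample i
        G = Z @ Z.T                           # local Gram matrix
        G += np.eye(n_neighbors) * reg * max(np.trace(G), 1e-12)  # regularize
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()

    # Step 3: the embedding is given by the bottom eigenvectors of
    # (I - W)^T (I - W), skipping the first (constant) eigenvector.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, vecs = eigh(M)
    return vecs[:, 1:n_components + 1]


# Usage on synthetic data standing in for spectra (rows = samples).
rng = np.random.default_rng(0)
spectra = rng.random((60, 200)) + 0.1
embedding = lle_custom_metric(spectra, n_neighbors=8, n_components=3, metric="cosine")
```

The low-dimensional `embedding` would then be passed to a regression model such as PLS or SVM. Note that the cosine and correlation measures are undefined for all-zero or constant rows, so real spectra would need to be checked before use.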