Identification of high-oil content soybean using hyperspectral reflectance and one-dimensional convolutional neural network

Xihai Zhang, Jianxin Liao, Yue Yang, Hongbo Li, Kezhu Tan

doi:10.1080/00387010.2022.2160463

Abstract

Identifying soybean seeds with high oil content is important because seed oil content determines oil yield. Most related studies have used machine learning algorithms to identify soybean varieties from small sample sets. In this study, 5800 spectral samples from 58 varieties were collected in the 400–1000 nm range. A method combining hyperspectral imaging with a one-dimensional convolutional neural network was proposed to distinguish high-oil-content soybean seeds. For comparison, traditional machine learning models, including support vector machine, k-nearest neighbor, and partial least squares discriminant analysis, were also established. Four preprocessing methods, namely moving window smoothing, standard normal variate, multivariate scattering correction, and Savitzky–Golay, were compared when building support vector machine-based identification models. The model using multivariate scattering correction gave the highest test accuracy (94.5%), indicating that it was the most suitable of the four preprocessing methods for this study. The study also compared the performance of the four models as the number of samples was expanded. The results showed that the proposed one-dimensional convolutional neural network model was more stable. The average accuracies on the training set and test set were 96% and 93%, respectively. Therefore, hyperspectral data combined with a one-dimensional convolutional neural network was effective in identifying soybean seeds with high oil content.
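Two of the preprocessing methods named in the abstract, standard normal variate (SNV) and the scatter correction the abstract calls multivariate scattering correction (commonly written multiplicative scatter correction, MSC), can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of the mean spectrum as the MSC reference, the population standard deviation in SNV, and the synthetic five-band spectra are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own
    mean and standard deviation (population std is an assumption here)."""
    mean = fmean(spectrum)
    std = pstdev(spectrum)
    return [(value - mean) / std for value in spectrum]


def msc(spectra, reference=None):
    """Scatter correction: fit each spectrum as intercept + slope * reference
    by least squares, then remove the fitted offset and gain.
    The mean spectrum is used as the reference when none is given."""
    if reference is None:
        reference = [fmean(band) for band in zip(*spectra)]
    ref_mean = fmean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for spectrum in spectra:
        spec_mean = fmean(spectrum)
        slope = sum(
            (r - ref_mean) * (v - spec_mean)
            for r, v in zip(reference, spectrum)
        ) / ref_ss
        intercept = spec_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in spectrum])
    return corrected


# Synthetic data (hypothetical): one base reflectance curve distorted by
# three different additive offsets and multiplicative gains.
base = [0.10, 0.15, 0.30, 0.55, 0.60]
distortions = [(0.00, 1.0), (0.05, 1.2), (-0.02, 0.8)]
spectra = [[offset + gain * v for v in base] for offset, gain in distortions]

msc_spectra = msc(spectra)
snv_spectra = [snv(s) for s in spectra]
```

Because the synthetic spectra differ only by an offset and a gain, both corrections map them onto a single common curve; real reflectance spectra would retain the chemical differences that a classifier then learns from.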

Full Text