Rapid identification of soybean seed varieties is crucial for agricultural production and seed quality. Conventional chemical methods for identifying soybean seed varieties are time-consuming and destructive, which makes them unsuitable for seed quality evaluation. This study used hyperspectral imaging (HSI) to identify four varieties of soybean seeds. Hyperspectral images of the seeds were collected in the spectral range of 400–1000 nm. A multi-level data fusion strategy combining spectral and image information was proposed to improve model accuracy. Classification models of the soybean seeds were then established by applying this strategy with partial least squares discriminant analysis (PLS-DA). The models built with the multi-level data fusion strategy achieved better prediction performance than the models built on a single analytical source. High-level data fusion (HLDF) based on Bayesian consensus gave the best results, with an accuracy (Acc) of 93.13% and an F1-score of 93.70% in the prediction phase. Therefore, the multi-level data fusion strategy can serve both as an identification method for soybean seed varieties and as an effective approach to enhancing the discriminatory capability of classification models.
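
To illustrate the decision-level idea behind HLDF, the following is a minimal sketch of one common form of Bayesian consensus. It assumes that each single-source classifier (for example, one spectral model and one image model) outputs a class label, and that its reliability is summarized by a confusion matrix obtained on calibration data. The function names, the smoothing term, and the confusion-matrix counts are hypothetical and are not taken from this study; the exact formulation used here may differ.

```python
import numpy as np

def likelihoods_from_confusion(conf, smoothing=1.0):
    """Estimate P(predicted = j | true = i) from a confusion matrix.

    Rows are true classes, columns are predicted classes. The additive
    smoothing (an assumption of this sketch) avoids zero likelihoods.
    """
    conf = np.asarray(conf, dtype=float) + smoothing
    return conf / conf.sum(axis=1, keepdims=True)

def bayesian_consensus(predictions, confusions, prior=None, smoothing=1.0):
    """Fuse the class labels predicted by several models for one sample.

    predictions: one predicted class index per model.
    confusions:  one calibration confusion matrix per model.
    Returns the fused class index and the posterior over classes.
    """
    n_classes = np.asarray(confusions[0]).shape[0]
    if prior is None:
        posterior = np.full(n_classes, 1.0 / n_classes)  # uniform prior
    else:
        posterior = np.asarray(prior, dtype=float)
    for pred, conf in zip(predictions, confusions):
        # Likelihood of observing this prediction under each true class.
        lik = likelihoods_from_confusion(conf, smoothing)[:, pred]
        posterior = posterior * lik
        posterior = posterior / posterior.sum()  # renormalize after each update
    return int(np.argmax(posterior)), posterior

# Made-up calibration confusion matrices for four varieties (illustration only).
spectral_conf = [[45, 3, 1, 1], [2, 46, 1, 1], [1, 2, 46, 1], [1, 1, 2, 46]]
image_conf = [[30, 10, 5, 5], [8, 32, 5, 5], [6, 6, 33, 5], [5, 5, 8, 32]]

# The spectral model predicts variety 0 and the image model predicts variety 1.
label, posterior = bayesian_consensus([0, 1], [spectral_conf, image_conf])
```

In this sketch, a disagreement between the two models is resolved in favor of the one whose calibration confusion matrix indicates higher reliability, which is the intuition behind combining decisions rather than raw features.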