Pecan (Carya illinoinensis K.), well known as both a dried seed crop and a woody oil tree, faces two obstacles in its industry: quality assessment methods are complex, and varieties are easily confused. These obstacles have seriously hampered the development of a large-scale pecan deep-processing industry. This work aimed to apply hyperspectral imaging (HSI) combined with machine learning to evaluate the quality of pecan seeds and to classify their varieties. The samples comprised 19 varieties of pecan seeds, with 30 seeds per variety. After spectral preprocessing, features were extracted from the spectral profiles. Back-propagation neural network models and partial least squares models were established to predict the crude fat and moisture contents of pecan seeds. The best models achieved R² values of 0.887 for crude fat and 0.950 for moisture. Additionally, support vector machine models were developed to identify pecan varieties; the model identified the 19 varieties with an accuracy of 0.965. In conclusion, the combination of HSI and machine learning could be an effective tool for improving the pecan industry and for providing sustainable and efficient methods for pecan seed production.
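The two figures of merit reported above, the R² score for the regression models and the accuracy for the variety classifier, can be illustrated with a minimal sketch. It assumes the conventional definitions (R² as one minus the ratio of residual to total sum of squares, accuracy as the fraction of correctly labelled samples); the abstract does not state which exact formulas were used. All numbers and variety labels below are made up for illustration and are not data from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true, labels_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(labels_true, labels_pred))
    return correct / len(labels_true)


# Hypothetical crude fat contents (%), NOT measurements from the paper.
fat_measured = [70.0, 72.0, 68.0, 74.0]
fat_predicted = [71.0, 71.0, 69.0, 73.0]

# Hypothetical variety labels, NOT the varieties used in the paper.
variety_true = ["A", "B", "C", "A"]
variety_pred = ["A", "B", "C", "B"]

print(f"R2 = {r2_score(fat_measured, fat_predicted):.3f}")
print(f"accuracy = {accuracy(variety_true, variety_pred):.3f}")
```

Under these definitions, an R² near 1 means the predicted contents track the measured ones closely, and an accuracy of 0.965 would mean that 96.5% of the evaluated seeds were assigned to the correct variety.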