The quality and price of navel oranges vary with their geographical origin, which creates a financial incentive for origin fraud. Preventing such fraud requires a fast, non-destructive, and precise method for tracing the origin of navel oranges. In this study, a total of 490 Newhall navel oranges were selected from five major production regions in China, and their diffuse reflectance near-infrared spectra in the range 4000–10,000 cm⁻¹ were collected non-invasively. We examined seven preprocessing techniques for the spectra: Savitzky–Golay (SG) smoothing, first derivative (FD), multiplicative scatter correction (MSC), and the combinations SG+MSC, SG+FD, MSC+FD, and SG+MSC+FD. A one-dimensional convolutional neural network (1DCNN) deep learning model for tracing the geographical origin of navel oranges was established and compared with five machine learning algorithms: partial least squares discriminant analysis (PLS-DA), linear discriminant analysis (LDA), support vector machine (SVM), random forest (RF), and back-propagation neural network (BPNN). The results show that the 1DCNN model with SG+FD preprocessing achieved the best performance on the testing set, with prediction accuracy, precision, recall, and F1-score of 97.92%, 98%, 97.95%, and 97.90%, respectively. Therefore, near-infrared spectroscopy (NIRS) combined with deep learning has significant research and application value for the rapid, non-destructive, and accurate geographical origin traceability of agricultural products.
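As an illustration of the preprocessing steps named above, the following is a minimal Python sketch of MSC and of a combined SG+FD transform. It is not the authors' implementation. The function names `msc` and `sg_fd`, the window length, and the polynomial order are assumptions chosen for illustration, and computing the first derivative inside the Savitzky–Golay filter (rather than smoothing first and differencing afterwards) is one possible reading of "SG+FD". Spectra are assumed to be stored as rows of a 2-D array.

```python
import numpy as np
from scipy.signal import savgol_filter


def msc(spectra, reference=None):
    """Multiplicative scatter correction.

    Each spectrum is regressed on a reference spectrum (the mean
    spectrum by default) and corrected with the fitted slope and
    intercept.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        ref = spectra.mean(axis=0)
    else:
        ref = np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        # Least-squares fit of s ~ slope * ref + intercept
        slope, intercept = np.polyfit(ref, s, deg=1)
        corrected[i] = (s - intercept) / slope
    return corrected


def sg_fd(spectra, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing combined with a first derivative.

    window_length and polyorder are illustrative values, not the
    settings used in the study.
    """
    spectra = np.asarray(spectra, dtype=float)
    return savgol_filter(
        spectra,
        window_length=window_length,
        polyorder=polyorder,
        deriv=1,
        axis=1,
    )
```

Under these assumptions, an SG+MSC+FD pipeline could be approximated as `sg_fd(msc(X))`, where `X` is a hypothetical array with one spectrum per row. The order in which the steps are applied is itself a design choice that the abstract does not specify.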