Abstract

Different geographical origins can lead to great variance in coffee quality, taste, and commercial value. Hence, verifying the authenticity of the origin of coffee beans is of great importance for producers and consumers worldwide. In this study, terahertz (THz) spectroscopy combined with machine learning was investigated as a fast and non-destructive method to classify the geographic origin of coffee beans. Popular machine learning methods, including convolutional neural network (CNN), linear discriminant analysis (LDA), and support vector machine (SVM), were compared to obtain the best model. Because of the curse of dimensionality, some classification methods struggle to train effective models on high-dimensional spectral data. Thus, principal component analysis (PCA) and a genetic algorithm (GA) were applied for LDA and SVM to create a smaller set of features. The first nine principal components (PCs), extracted by PCA with a cumulative contribution rate of 99.9%, and 21 variables selected by GA were the inputs of the LDA and SVM models. The results demonstrate that excellent classification (90% accuracy on the prediction set) could be achieved using the CNN method. The results also indicate that variable selection is an important step in creating an accurate and robust discrimination model: the performance of the LDA and SVM algorithms could be improved with spectral features extracted by PCA and GA. GA-SVM achieved 75% accuracy on the prediction set, while SVM and PCA-SVM achieved 50% and 65% accuracy, respectively. These results demonstrate that THz spectroscopy, together with machine learning methods, is an effective approach for classifying the geographical origins of coffee beans. They also point to the potential of deep learning for authenticating agricultural products and expand the application of THz spectroscopy.
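The "contribution rate" used to choose the number of PCs can be illustrated with a short sketch. The code below is not the authors' implementation. It assumes the rate is the usual cumulative explained-variance ratio, it runs on synthetic data rather than THz spectra, and the function names (`cumulative_contribution`, `n_components_for`, `pca_scores`) are hypothetical.

```python
import numpy as np


def cumulative_contribution(X):
    """Cumulative fraction of total variance explained by successive PCs."""
    Xc = X - X.mean(axis=0)                   # center each spectral variable
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    var = s ** 2                              # proportional to PC variances
    return np.cumsum(var) / var.sum()


def n_components_for(X, threshold=0.999):
    """Smallest number of PCs whose cumulative contribution reaches threshold."""
    cum = cumulative_contribution(X)
    k = int(np.searchsorted(cum, threshold)) + 1
    return min(k, cum.size)                   # guard against rounding at the end


def pca_scores(X, k):
    """Project centered data onto the first k principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T


# Synthetic stand-in for spectra (assumption): 50 samples, 100 variables,
# generated from only 3 latent factors, so very few PCs are needed.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 100))

k = n_components_for(X, threshold=0.999)
scores = pca_scores(X, k)
print(f"PCs needed for 99.9% cumulative contribution: {k}")
print(f"reduced feature matrix shape: {scores.shape}")
```

The reduced `scores` matrix would then take the place of the full spectra as classifier input, which is the role the nine PCs play for the LDA and SVM models described above.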

Highlights

  • Coffee, as one of the most popular beverages in the world, is widely appreciated by consumers for its unique aroma, flavor, and refreshing effect [1,2,3]

  • We develop several methods based on THz spectroscopy combined with machine learning to classify the geographical origins of coffee beans

  • The results offer a new approach to applying THz spectroscopy and machine learning methods in food and agricultural applications


Summary

Introduction

As one of the most popular beverages in the world, coffee is widely appreciated by consumers for its unique aroma, flavor, and refreshing effect [1,2,3]. The sensory properties of coffee are profoundly affected by the composition of the coffee beans, which in turn depends mainly on climate characteristics associated with different latitudes and altitudes. Central and South America offer optimal climate conditions for coffee plants. Coffee quality, taste, and commercial value vary greatly with geographical origin [4,5,6,7]. This variability might increase the risk of fraud, such as mislabeling the product to conceal the true geographical origin of the coffee beans [8]. Coffee producers and consumers therefore strongly encourage the development of analytical methods that can efficiently evaluate the geographical origin of coffee beans.
