The aim of this study was to compare the accuracy of several algorithms in classifying data collected from food scent samples. Measurements made with an electronic nose (eNose) can be used to classify different scents. An eNose was used to measure scent samples from seven food scent sources, both from an open plate and from a sealed jar. The k-Nearest Neighbour (k-NN) classifier provides reasonable accuracy under certain conditions and traditionally uses the Euclidean distance to measure the similarity of samples. The Euclidean distance was therefore used as the baseline distance metric for the k-NN in this paper, and its classification accuracy was compared with that of the k-NN with 66 alternative distance metrics. In addition, 18 other classifiers were tested with raw eNose data. For each classifier, various parameter settings were tried and compared. Overall, 304 classifier variations were tested, each differing from the others in at least one parameter value. The results showed that Quadratic Discriminant Analysis, MLPClassifier, C-Support Vector Classification (SVC), and several single hidden layer neural networks yielded lower misclassification rates on the raw data than k-NN with Euclidean distance. Both the MLPClassifier and the SVC yielded misclassification rates of less than when applied to raw data. Furthermore, when applied both to the raw data and to the data preprocessed by principal component analysis that explained at least or of the total variance in the raw data, Quadratic Discriminant Analysis outperformed the other classifiers. The findings of this study can be used for further algorithm development. They can also be used, for example, to improve the estimation of storage times of fruit.
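As a rough illustration of the comparison described above, the following minimal sketch runs a k-NN classifier with two interchangeable distance metrics and reports the misclassification rate for each. It is not the authors' implementation: the two-dimensional "sensor readings", the labels `scent_a` and `scent_b`, the choice of k = 3, and all function names are hypothetical and chosen only to keep the example small.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Baseline metric: straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # One possible alternative metric: sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, metric):
    # Rank training samples by distance to the query and keep the k nearest.
    ranked = sorted(zip(train_X, train_y), key=lambda pair: metric(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    # Majority vote among the k nearest neighbours.
    return votes.most_common(1)[0][0]


def misclassification_rate(train_X, train_y, test_X, test_y, k, metric):
    # Fraction of test samples whose predicted label differs from the true label.
    errors = sum(
        knn_predict(train_X, train_y, x, k, metric) != y
        for x, y in zip(test_X, test_y)
    )
    return errors / len(test_y)


# Hypothetical, synthetic data (NOT real eNose measurements).
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.0), (3.2, 2.9), (2.8, 3.1)]
train_y = ["scent_a", "scent_a", "scent_a", "scent_b", "scent_b", "scent_b"]
test_X = [(1.1, 0.9), (3.1, 3.0), (2.9, 2.8)]
test_y = ["scent_a", "scent_b", "scent_b"]

for name, metric in [("euclidean", euclidean), ("manhattan", manhattan)]:
    rate = misclassification_rate(train_X, train_y, test_X, test_y, k=3, metric=metric)
    print(f"{name}: misclassification rate = {rate:.2f}")
```

Passing the metric as a function keeps the classifier itself unchanged while the distance measure varies, which mirrors the idea of holding k-NN fixed and swapping in alternative metrics against the Euclidean baseline.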