Abstract

Machine learning algorithms are widely used in product sorting processes in the food industry. Classification
relies on product attributes, which differ from one product to another. In this study, the k-nearest
neighbor (KNN) algorithm was used to classify the Kama, Rosa and Canadian wheat varieties. The Seeds
dataset from the UCI (University of California, Irvine) Machine Learning Repository was used. The data
set contains 70 examples of each wheat class. The wheat kernels were randomly selected for the
experiment, and a soft X-ray technique was used to obtain high-quality images of their internal kernel
structure. In addition, the effect of the distance metric and of the amount of training data on
classification success was measured. The KNN algorithm was tested with training proportions ranging
from 50% to 90% of the data set. Neighborhood values of k = 1, 3 and 5 were selected to examine their
effect on classification success. The Euclidean, Chebyshev, Manhattan and Mahalanobis distance metrics
were tested for each neighborhood value k. According to the results, the Mahalanobis metric with
k = 3 achieved a classification success of 0.9924 as measured by the AUC (Area Under the Curve)
metric. There is no study in the literature that jointly compares KNN neighborhood values and distance
metrics on food data sets under varying training and test splits. The study is therefore expected to
make an important contribution to the literature.
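
The comparison described above (training proportion, neighborhood value k, distance metric) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes synthetic two-dimensional toy data in place of the Seeds dataset, reports plain accuracy rather than AUC, and leaves out the Mahalanobis metric, which additionally needs the inverse covariance matrix of the training data. All function names (`knn_predict`, `evaluate`, `make_toy_data`, `grid`) are hypothetical.

```python
import random
from collections import Counter

# Three of the four distance metrics named in the abstract.
# Mahalanobis is omitted in this sketch (it needs an inverse covariance matrix).
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

METRICS = {"euclidean": euclidean, "manhattan": manhattan, "chebyshev": chebyshev}

def knn_predict(train_X, train_y, query, k, metric):
    """Majority vote among the k training points closest to `query`.

    Ties are resolved in favor of the label seen first, i.e. the nearer neighbor.
    """
    order = sorted(range(len(train_X)), key=lambda i: metric(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def evaluate(X, y, train_ratio, k, metric, seed=0):
    """Shuffle, split by `train_ratio`, and return accuracy on the held-out part."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_ratio)
    train, test = idx[:cut], idx[cut:]
    train_X = [X[i] for i in train]
    train_y = [y[i] for i in train]
    correct = sum(knn_predict(train_X, train_y, X[i], k, metric) == y[i] for i in test)
    return correct / len(test)

def make_toy_data(n_per_class=70, seed=0):
    """Synthetic stand-in: three Gaussian blobs in 2-D. This is NOT the Seeds data."""
    rng = random.Random(seed)
    centers = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    return X, y

def grid():
    """Accuracy for every (training ratio, k, metric) combination."""
    X, y = make_toy_data()
    results = {}
    for ratio in (0.5, 0.6, 0.7, 0.8, 0.9):
        for k in (1, 3, 5):
            for name, metric in METRICS.items():
                results[(ratio, k, name)] = evaluate(X, y, ratio, k, metric)
    return results

if __name__ == "__main__":
    for key, acc in sorted(grid().items()):
        print(key, round(acc, 4))
```

Calling `grid()` returns a dictionary keyed by `(ratio, k, metric_name)`. To reproduce the setting in the abstract, the toy data would have to be replaced by the real Seeds attributes and the accuracy score by a multi-class AUC.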
