Comparison of Data Normalization for Wine Classification Using K-NN Algorithm

Rohitash Chandra

doi:10.47738/ijiis.v5i4.145

Abstract

When attributes span very different value ranges, the quality of data mining results can suffer, so the data must be preprocessed first. Here, preprocessing is expected to increase the accuracy of classification on the wine dataset. The preprocessing method used is data transformation by normalization, for which three techniques are considered: min-max normalization, z-score normalization, and decimal scaling. The data produced by each normalization method are classified with the K-NN algorithm, and the resulting accuracies are compared to find the best one. The values of K compared were 1, 3, 5, 7, 9, and 11. Before classification, the normalized wine dataset was divided into training data and test data using k-fold cross-validation with k = 10. The classification tests with the K-NN algorithm show that the best accuracy, 65.92%, is obtained on the wine dataset normalized with the min-max method at K = 1. The average accuracy obtained is 59.68%.
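
The pipeline described above can be illustrated with a minimal Python sketch. It is not the author's implementation: all function names are hypothetical, and several details the abstract does not specify are assumptions, namely the population standard deviation in the z-score, Euclidean distance for K-NN, and a simple interleaved fold assignment. As in the abstract, the whole dataset is normalized before it is split into folds.

```python
import math
from collections import Counter

def min_max(column, new_min=0.0, new_max=1.0):
    """Rescale values linearly into [new_min, new_max]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid division by zero
        return [new_min for _ in column]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in column]

def z_score(column):
    """Subtract the mean and divide by the population standard deviation (assumption)."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    if std == 0:
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]

def decimal_scaling(column):
    """Divide by 10**j, with j the smallest integer that makes every |value| < 1."""
    max_abs = max(abs(v) for v in column)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in column]

def normalize_rows(rows, method):
    """Apply a column-wise normalization method to a list of feature rows."""
    columns = [method(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training rows nearest to query (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def cross_val_accuracy(X, y, k, folds=10):
    """Fraction of rows classified correctly when each fold is held out in turn."""
    correct = 0
    for f in range(folds):
        # Interleaved split (assumption): row i belongs to fold i % folds.
        train_idx = [i for i in range(len(X)) if i % folds != f]
        test_idx = [i for i in range(len(X)) if i % folds == f]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return correct / len(X)

# Toy data (invented for illustration, not the wine dataset):
# the second attribute has a much larger range than the first.
toy_X = [[1.0, 1000], [2.0, 1100], [1.5, 900], [8.0, 5000], [9.0, 5200], [8.5, 4800]]
toy_y = ["a", "a", "a", "b", "b", "b"]

for method in (min_max, z_score, decimal_scaling):
    scaled = normalize_rows(toy_X, method)
    print(method.__name__, cross_val_accuracy(scaled, toy_y, k=1, folds=3))
```

To mirror the comparison in the abstract, the final loop would be run on the wine dataset with `folds=10` and `k` taken from 1, 3, 5, 7, 9, and 11 for each normalization method. The toy data only demonstrate that the functions run and are not expected to reproduce the reported accuracies.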
