Analysis of the Application of Data Normalization Using Z-Score on the Performance of the K-NN Algorithm

Raditya Galih Whendasmoro, Joseph Joseph

doi:10.30865/jurikom.v9i4.4526

Open Access

Journal: JURIKOM (Jurnal Riset Komputer)
Publication Date: Aug 30, 2022
License type: CC BY-SA 4.0

Affiliation: Universitas Bung Karno

Abstract

The large volume of information in data means that datasets store a great deal of data. A dataset consists of attributes and attribute values, which together hold the information it contains. Data mining is a process for extracting information from datasets. However, datasets often contain irregular data, such as value ranges that differ widely between attributes. Such widely differing ranges lead to suboptimal results, because in data mining the quality of both the process and its results depends on the quality of the data stored in the dataset. Data normalization is a preprocessing stage in which the range of values in each attribute is rescaled. Z-Score Normalization is a statistical technique that can be used in data mining to preprocess data by transforming it. It can be combined with data mining classification techniques, where its role is to normalize the data in order to improve the performance of classification algorithms, specifically the K-NN algorithm in this study. The results show that Z-Score Normalization improves the performance of the K-NN algorithm, as seen in the increase in accuracy obtained from K-NN after normalizing the dataset compared with before. Before normalization, the accuracy values were 95.13%, 95.83%, 96.11%, 95.77%, and 95.81%; after normalization, they increased to 97.87%, 98.57%, 98.77%, 97.23%, and 98.11%, respectively.
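
To illustrate why differing value ranges matter for K-NN, the following is a minimal sketch, not the authors' implementation. It assumes the usual Z-Score definition z = (x - mean) / std with the population standard deviation, Euclidean distance, and majority voting. The toy data and the function names (zscore_fit, zscore_apply, knn_predict) are hypothetical and do not come from the study.

```python
import math
from collections import Counter


def zscore_fit(rows):
    """Return per-attribute (mean, std) computed from the training rows."""
    stats = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        # Population variance is an assumption; the abstract does not specify it.
        var = sum((x - mean) ** 2 for x in col) / len(col)
        std = math.sqrt(var) or 1.0  # avoid division by zero on constant attributes
        stats.append((mean, std))
    return stats


def zscore_apply(rows, stats):
    """Transform every value to z = (x - mean) / std, attribute by attribute."""
    return [[(x - m) / s for x, (m, s) in zip(row, stats)] for row in rows]


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    ranked = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


# Hypothetical toy data: the first attribute separates the classes,
# the second has a much larger value range and is unrelated to the class.
train_X = [[1.0, 5000], [1.2, 9000], [3.0, 5200], [3.2, 8800]]
train_y = ["A", "A", "B", "B"]
query = [1.1, 8850]

# Without normalization the large-range attribute dominates the distance.
raw_pred = knn_predict(train_X, train_y, query, k=3)

# With Z-Score Normalization both attributes contribute on a comparable scale.
stats = zscore_fit(train_X)
norm_X = zscore_apply(train_X, stats)
norm_query = zscore_apply([query], stats)[0]
norm_pred = knn_predict(norm_X, train_y, norm_query, k=3)

print("raw:", raw_pred, "normalized:", norm_pred)
```

In this toy case the query is predicted as class B on the raw values and as class A after normalization, because the second attribute no longer dominates the distance. The mean and standard deviation are computed from the training rows only and then reused for the query, which is a common design choice to keep test data from influencing the scaling.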

Full Text