K Nearest Neighbor Imputation Performance on Missing Value Data Graduate User Satisfaction

Abdul Fadlil, Herman Herman, Dikky Praseptian M

doi:10.29207/resti.v6i4.4173

Open Access

Abstract

Missing values are a common problem in data processing for scientific research and reduce the accuracy of research results. Several methods have been applied to handle missing values: deleting all records that contain a missing value; replacing missing values with a single calculated statistic such as the mean, median, minimum, maximum, or most frequent value; maximum likelihood and expectation maximization; and machine learning methods such as K Nearest Neighbor (KNN). This research uses KNN Imputation (KNNI) to predict missing values. The data come from a questionnaire survey of graduate user satisfaction levels with seven assessment criteria, namely ethics, expertise in the field of science (main competence), foreign language skills, use of information technology, communication skills, cooperation, and self-development. Imputation predictions using KNNI were tested on user satisfaction level data for STMIK PPKIA Tarakanita Rahmawati graduates from 2018 to 2021, using five values of k nearest neighbors, namely 1, 5, 10, 15, and 20. With k = 5, the RMSE is 0.316 and the MAPE is 3.33%, both smaller than the errors obtained with the other values of k. K nearest neighbor 5 therefore gives the best imputation prediction by both RMSE and MAPE, and its MAPE is below 10%, which is categorized as very good.
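The evaluation described in the abstract can be illustrated with a minimal sketch: hide some known values, impute them with the mean of the k nearest neighbors, and score the imputed values with RMSE and MAPE. This is not the authors' implementation. The function names (`distance`, `knn_impute`, `rmse`, `mape`), the distance definition, the toy scores, and the hidden cells are all assumptions made for illustration, and the toy data are too small to use the paper's values of k (1, 5, 10, 15, 20).

```python
import math

def distance(a, b):
    """Root mean squared difference over features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no overlap: treat the rows as infinitely far apart
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k):
    """Return a copy of rows where each None is replaced by the mean of that
    feature over the k nearest rows that have the feature observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])  # closest donors first
            nearest = [v for _, v in donors[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical satisfaction scores on seven criteria (NOT the paper's data).
complete = [
    [4, 4, 3, 4, 4, 4, 3],
    [4, 3, 3, 4, 4, 4, 4],
    [3, 3, 2, 3, 3, 3, 3],
    [3, 3, 2, 3, 3, 4, 3],
    [4, 4, 3, 4, 3, 4, 4],
    [2, 3, 2, 3, 3, 3, 2],
]
hidden = [(0, 2), (3, 5)]  # (row, column) cells masked for the test

masked = [list(r) for r in complete]
for r, c in hidden:
    masked[r][c] = None

actual = [complete[r][c] for r, c in hidden]
for k in (1, 3, 5):  # illustrative values of k only
    imputed = knn_impute(masked, k)
    predicted = [imputed[r][c] for r, c in hidden]
    print(f"k={k}: RMSE={rmse(actual, predicted):.3f}, MAPE={mape(actual, predicted):.2f}%")
```

Comparing the printed errors across k mirrors the selection step in the abstract, where the k with the smallest RMSE and MAPE is taken as the best imputation setting.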
