Performance Comparison of Hot-Deck Imputation, K-Nearest Neighbor Imputation, and Predictive Mean Matching in Missing Value Handling, Case Study: March 2019 SUSENAS Kor Dataset

Tsasya Raudhatunnisa, Nori Wilantika

doi:10.34123/icdsos.v2021i1.93

Abstract

Missing values can cause bias and make a dataset unrepresentative of the actual situation. The choice of method for handling missing values matters because it affects the estimates produced. This study therefore compares three imputation methods for handling missing values: Hot-Deck Imputation, K-Nearest Neighbor Imputation (KNNI), and Predictive Mean Matching (PMM). Because the three methods work differently, their estimates differ. The criteria used to compare them are the Root Mean Squared Error (RMSE), Unsupervised Classification Error (UCE), Supervised Classification Error (SCE), and the time taken to run the algorithm. The study consists of two analyses: a comparative analysis and a scoring analysis. The comparative analysis applies a simulation that accounts for the missing-value mechanism; the mechanisms simulated are Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR). The scoring analysis then narrows down the results of the comparative analysis by assigning a score to the imputation results of the three methods. Based on these scores, the results suggest that Hot-Deck Imputation performs best in handling missing values.
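The simulation idea in the abstract (mask known values under a chosen mechanism, impute them, then score the imputations by RMSE) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the toy two-variable data, the function names, the 20% missing rate, the median-split rule used for MAR and MNAR, and the simple random hot-deck and mean-of-k-neighbors imputers. It is not the authors' procedure, it does not use the SUSENAS data, and it leaves out PMM, UCE, SCE, and timing.

```python
import math
import random


def make_data(n, seed=0):
    """Toy data (hypothetical): y depends linearly on x plus noise."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


def make_mask(xs, ys, mechanism, rate=0.2, seed=1):
    """Return a list of booleans; True marks y as missing."""
    rng = random.Random(seed)
    if mechanism == "MCAR":
        # Missingness ignores the data entirely.
        return [rng.random() < rate for _ in ys]
    # MAR: driven by the observed x. MNAR: driven by the value of y itself.
    driver = xs if mechanism == "MAR" else ys
    cutoff = sorted(driver)[len(driver) // 2]
    # Rows above the median go missing more often than rows below it.
    return [rng.random() < (1.5 * rate if d > cutoff else 0.5 * rate)
            for d in driver]


def hot_deck(ys, mask, seed=2):
    """Random hot-deck: fill each gap with a randomly drawn observed donor."""
    rng = random.Random(seed)
    donors = [y for y, m in zip(ys, mask) if not m]
    return [rng.choice(donors) if m else y for y, m in zip(ys, mask)]


def knn_impute(xs, ys, mask, k=5):
    """Fill each gap with the mean y of the k donors closest in x."""
    donors = [(x, y) for x, y, m in zip(xs, ys, mask) if not m]
    out = []
    for x, y, m in zip(xs, ys, mask):
        if m:
            nearest = sorted(donors, key=lambda d: abs(d[0] - x))[:k]
            y = sum(d[1] for d in nearest) / len(nearest)
        out.append(y)
    return out


def rmse(true, imputed, mask):
    """RMSE over the masked entries only (assumes at least one is masked)."""
    sq = [(t - i) ** 2 for t, i, m in zip(true, imputed, mask) if m]
    return math.sqrt(sum(sq) / len(sq))


xs, ys = make_data(300)
for mech in ("MCAR", "MAR", "MNAR"):
    mask = make_mask(xs, ys, mech)
    hd = rmse(ys, hot_deck(ys, mask), mask)
    kn = rmse(ys, knn_impute(xs, ys, mask), mask)
    print(f"{mech}: hot-deck RMSE={hd:.3f}  KNN RMSE={kn:.3f}")
```

Because the true values are known before masking, the RMSE can be computed directly on the imputed entries. The ranking this toy example produces says nothing about the paper's conclusion, which rests on a score that also weighs UCE, SCE, and run time on real survey data.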

Journal: Proceedings of The International Conference on Data Science and Official Statistics
Publication Date: Jan 4, 2022
Citations: 2

Similar Papers

Empirical Performance Evaluation of Imputation Techniques using Medical Dataset
O A Alade ... A Selamat
IOP Conference Series: Materials Science and Engineering | VOL. 551
01 Aug 2019

An Improved Imputation Method for Accurate Prediction of Imputed Dataset Based Radon Time Series
Adil Aslam Mir ... Anwer Mustafa Hilal
IEEE Access | VOL. 10
01 Jan 2021

Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study
Andrea Marshall ... Douglas G Altman
BMC Medical Research Methodology | VOL. 10
19 Jan 2010

What is missing from my missing data plan?
Sharon D Yeatts ... Renée H Martin
Stroke | VOL. 46
07 May 2015
