Comparative Analysis of Imputation Methods for Enhancing Predictive Accuracy in Data Models

Nurul Aqilah Zamri, Rasyidah Rasyidah, M Izham Jaya, Indrarini Dyah Irawati, Shahreen Kasim, Taha H Rassem

doi:10.62527/joiv.8.3.1666

Abstract

Missing values in a dataset can introduce bias that impedes a predictive algorithm's ability to discern patterns and make accurate predictions. This paper examines prevalent data imputation methods: list-wise deletion (IGN), mean imputation (AVG), K-Nearest Neighbors (KNN), MissForest (MF), and Predictive Mean Matching (PMM). The dataset consists of financial data on S&P 500 companies from the Compustat North America database. The training and validation dataset comprises 1973 instances, with data from the fourth quarter of 2009, the first quarter of 2010, and the third quarter of 2014. Within this set, 457 missing values were identified and imputed. The test dataset comprises 197 randomly selected instances from the fourth quarter of 2014, equivalent to ten percent of the total instances in the training dataset. In the evaluation, the dataset produced by MF imputation performed best among all the imputed datasets. These findings are intended to help practitioners choose a suitable data imputation method, particularly for predictive modeling tasks.
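To make the simpler methods named in the abstract concrete, the following is a minimal sketch of list-wise deletion (IGN), mean imputation (AVG), and a basic KNN imputation on a toy table. It is not the study's implementation: the function names, the toy values, and the choice of distance (root mean squared difference over the columns both rows have observed) are assumptions made for illustration. MissForest and PMM are omitted because they depend on fitted random forest and regression models.

```python
import math

# Toy rows of numeric features; None marks a missing value.
# These numbers are invented for illustration, not taken from the study's data.
rows = [
    [1.0, 2.0, 3.0],
    [2.0, None, 4.0],
    [3.0, 6.0, None],
    [4.0, 8.0, 9.0],
]

def listwise_deletion(data):
    """IGN: keep only the rows that have no missing values."""
    return [row[:] for row in data if None not in row]

def mean_imputation(data):
    """AVG: replace each missing value with the mean of its column's observed values."""
    means = []
    for j in range(len(data[0])):
        observed = [row[j] for row in data if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in data]

def knn_imputation(data, k=2):
    """KNN: replace each missing value with the mean of that column over the
    k nearest rows that have the column observed (assumed distance: root mean
    squared difference over columns observed in both rows)."""
    result = [row[:] for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_i, other in enumerate(data):
                if other_i == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                result[i][j] = sum(nearest) / len(nearest)
    return result

deleted = listwise_deletion(rows)
mean_filled = mean_imputation(rows)
knn_filled = knn_imputation(rows, k=2)
```

On this toy table, list-wise deletion keeps only the two complete rows, mean imputation fills both gaps with the column mean 16/3, and the KNN sketch fills them with 4.0 and 6.5. The contrast shows why the choice of method can change what a downstream predictive model sees.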

Journal: JOIV : International Journal on Informatics Visualization
Publication Date: Sep 25, 2024
License type: CC BY-SA 4.0

Similar Papers

Comparing the performance of eight imputation methods for propensity score matching in missing data problem
Imran Kurt Omurlu ... Mevlut Ture
Journal of Statistics and Management Systems | VOL. 26
01 Jan 2023

Evaluation of machine learning methods for covariate data imputation in pharmacometrics.
Dominic Stefan Bräm ... Marc Pfister
CPT: Pharmacometrics & Systems Pharmacology | VOL. 11
08 Nov 2022

Comparison of Single and MICE Imputation Methods for Missing Values: A Simulation Study
Nurul Azifah Mohd Pauzi ... Yap Bee Wah
Pertanika Journal of Science and Technology | VOL. 29
30 Apr 2021

Imputation methods on retrospective breast cancer data in Tanzania: A comparative study
Rahibu A Abassi ... Amina S Msengwa
Women Health Care and Issues | VOL. 5
06 Jun 2022
