Comparative Study of Four Methods in Missing Value Imputations under Missing Completely at Random Mechanism

Michikazu Nakai, Ding-Geng Chen, Kunihiro Nishimura, Yoshihiro Miyamoto

doi:10.4236/ojs.2014.41004

Abstract

In analyzing data from clinical trials and longitudinal studies, missing values are a fundamental challenge because missing data can introduce bias and lead to erroneous statistical inferences. To deal with this challenge, several methods have been developed in the literature; the most commonly used are the complete case method, the mean imputation method, the last observation carried forward (LOCF) method, and the multiple imputation (MI) method. In this paper, we conduct a simulation study to investigate the efficiency of these four methods in a longitudinal data setting under the missing completely at random (MCAR) mechanism. We consider three levels of missingness: a low level of 5% and higher levels of 30% and 50%. The simulation study shows that the LOCF method has more bias than the other three methods in most situations, while the MI method has the least bias and the best coverage probability. Thus, we conclude that the MI method is the most effective imputation method in our MCAR simulation study.
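To make the methods named in the abstract concrete, the following is a minimal sketch of MCAR masking and three of the four methods (complete case, mean imputation, LOCF) on toy longitudinal data. It is not the authors' simulation code: the function names, the sample size, the number of visits, the linear trend, and the choice to keep the first visit always observed are all assumptions made for illustration. Multiple imputation is omitted because it needs an imputation model that a short example cannot show faithfully.

```python
import random
import statistics


def make_mcar(data, rate, rng):
    """Return a copy of `data` in which each follow-up value is set to None
    independently with probability `rate` (MCAR).
    Assumption: the first visit is never masked, so LOCF always has a
    value to carry forward."""
    return [[v if (j == 0 or rng.random() >= rate) else None
             for j, v in enumerate(row)]
            for row in data]


def complete_case(data):
    """Keep only subjects with no missing value at any visit."""
    return [row for row in data if None not in row]


def mean_impute(data):
    """Replace each missing value with the observed mean of its visit.
    Raises statistics.StatisticsError if a visit has no observed values."""
    n_visits = len(data[0])
    means = [statistics.fmean(row[j] for row in data if row[j] is not None)
             for j in range(n_visits)]
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in data]


def locf(data):
    """Replace each missing value with the subject's last observed value."""
    out = []
    for row in data:
        filled, last = [], None
        for v in row:
            last = v if v is not None else last
            filled.append(last)
        out.append(filled)
    return out


def last_visit_mean(rows):
    """Mean of the final visit, used here as the quantity of interest."""
    return statistics.fmean(row[-1] for row in rows)


# Hypothetical toy data: 200 subjects, 4 visits, upward trend plus noise.
rng = random.Random(2014)
full = [[10.0 + 2.0 * t + rng.gauss(0.0, 1.0) for t in range(4)]
        for _ in range(200)]
observed = make_mcar(full, rate=0.30, rng=rng)

print("full data      :", round(last_visit_mean(full), 2))
for name, method in [("complete case", complete_case),
                     ("mean imputation", mean_impute),
                     ("LOCF", locf)]:
    print(f"{name:15s}:", round(last_visit_mean(method(observed)), 2))
```

In this toy setup the outcome rises over time, so LOCF fills a missing final visit with an earlier, typically lower value; that is one way a carried-forward value can bias an estimate even when missingness is completely at random. The exact numbers depend on the seed and on the assumed data model.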

Highlights

Missing values often occur in clinical trials and longitudinal studies.
Shrive et al. [10] suggested that the multiple imputation (MI) method was the most accurate method for dealing with missing data in most data scenarios, but in some situations the mean imputation method performed slightly better than the MI method.
White and Carlin [11] made a similar point, stating that the complete case method was more efficient than the MI method in some scenarios, even though the MI method was widely advocated as an improvement over the complete case method.

Summary

Introduction

Missing values often occur in clinical trials and longitudinal studies. Whenever data are missing, information is lost, which reduces efficiency and lowers the precision of statistical inference. When the dataset is large enough, the analysis can use the complete case method, in which a subject is deleted entirely whenever that subject has a missing value at any measurement occasion. Some statistical procedures and software apply this deletion automatically and then run as though there were no missing values. Although a rule of thumb suggests that 20% or less missing data is acceptable for imputation [1,2,3,4], no clear rules exist regarding how much missing data is too much [5].
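The cost of deleting whole subjects can be illustrated with a small calculation. The sketch below assumes that each visit is missing independently with the same probability, which is a simplification chosen for illustration and not necessarily the missingness design used in the paper; the function name and the choice of four visits are hypothetical. Under that assumption, the expected fraction of subjects retained by the complete case method is (1 - p) raised to the number of visits.

```python
def expected_complete_fraction(p_missing, n_visits):
    """Expected fraction of subjects with no missing visit, assuming each
    visit is missing independently with probability `p_missing`."""
    return (1.0 - p_missing) ** n_visits


# Missingness levels taken from the abstract; four visits is an assumption.
for p in (0.05, 0.30, 0.50):
    kept = expected_complete_fraction(p, n_visits=4)
    print(f"per-visit missingness {p:.0%}: "
          f"about {kept:.1%} of subjects remain after complete-case deletion")
```

Because the retained fraction shrinks geometrically with the number of visits, even a modest per-visit missingness rate can discard a large share of subjects in a longitudinal design.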

Background

Missing Mechanism

Imputation Methods

Simulation Settings

Simulation Performance Measures

Missingness Mechanism

Simulation Result

Method Original Complete

Simulation in Other Scenarios

Simulation Result with Small ρ Value

Simulation Result with Unstructured Correlation Structure

Findings

Discussion and Conclusions


Journal: Open Journal of Statistics	Publication Date: Jan 1, 2014
Citations: 37	License type: CC BY 4.0


