Abstract

Different methods of imputation are adopted in this study to compensate for missing values encountered in the data collected. The imputation methods considered are Overall Mean Value, Random Overall, Logistic Regression, Linear Regression, Predictive Match, Multiple Imputation and Hot Deck Imputation. The values obtained by the methods were analysed and compared using Bartlett’s test statistic for equality of variances among groups (the Mean Square Errors of the seven methods). The software packages used for this research work are Winmice, Solas and SAS (Winmice Prototype Version 0.1; Solas Version 3.2; SAS Learning Edition Version 4.1). The various methods produced different estimates. However, the test results showed no significant differences among the group variances; that is, any of the imputation methods could be used. A further test using relative variance revealed that the multiple imputation method may be preferred.

Keywords: Missing at Random, Imputation, Bartlett’s Test, Coefficient of Relative Variance

Highlights

  • This is rarely possible, since a number of factors may lead to missing values in sample surveys. Afifi and Elashoff (1966) highlighted [...]

  • Y1...Y4 are the dependent variables, while X is the independent variable (see the main work for the data). Summary of the estimation of the missing values using the various imputation methods

  • In this study we considered seven methods of imputation, namely Overall Mean Value, Random Overall, Logistic Regression, Linear Regression, Predictive Match, Multiple Imputation and Hot Deck Imputation
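
As a rough illustration of the two simplest methods named above, the following Python sketch fills missing entries with the overall mean and with a random draw from the observed values. It is not the procedure implemented in Winmice, Solas or SAS; the function names, the use of None to mark a missing entry, and the data are assumptions made for this example only.

```python
import random
import statistics


def overall_mean_impute(values):
    """Replace each missing entry (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


def random_overall_impute(values, rng):
    """Replace each missing entry (None) with a random draw from the observed entries."""
    observed = [v for v in values if v is not None]
    return [rng.choice(observed) if v is None else v for v in values]


# Hypothetical data: None marks a missing response.
y = [4.0, None, 6.0, 8.0, None]

mean_filled = overall_mean_impute(y)
random_filled = random_overall_impute(y, random.Random(0))  # seeded for repeatability

print(mean_filled)
print(random_filled)
```

Mean imputation gives every missing item the same value and so shrinks the variance of the completed data, whereas a random draw from the observed values preserves more of the spread. This is one reason the variances of the methods are worth comparing.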


Summary

There are several methods for handling missing data

This is rarely possible, since a number of factors may lead to missing values in a sample survey: because a sample unit may refuse or be unable to answer a particular question, or because of fatigue, sensitivity, lack of knowledge or other factors, respondents not infrequently leave a particular item blank on mail questionnaires or decline to give any response.

Afifi and Elashoff (1966) highlighted [...] Lepkowski et al. (1987) analyzed imputed data from a sample survey, the National Medical Care Utilization and Expenditure Survey (NMCUES), which was designed to collect data about the United States civilian non-institutionalized population in the 1980s.

Hot Deck Imputation
Predictive Match
Multiple Imputation
The test statistic
Coefficient of relative variance
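
The two comparison tools listed above can be sketched in Python as follows. The first function computes the standard Bartlett statistic for equality of variances across k groups, which is compared against a chi-square distribution with k - 1 degrees of freedom. The second assumes that the coefficient of relative variance is the sample variance divided by the squared mean; the main work may define it differently. The function names and data are hypothetical.

```python
import math
import statistics


def bartlett_statistic(groups):
    """Bartlett's statistic for equality of variances across k groups."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    # Sample variances (n - 1 denominator) of each group.
    variances = [statistics.variance(g) for g in groups]
    # Pooled variance, weighted by each group's degrees of freedom.
    pooled = sum((n - 1) * s2 for n, s2 in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(s2) for n, s2 in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction


def relative_variance(values):
    """ASSUMED definition: sample variance divided by the squared mean."""
    return statistics.variance(values) / statistics.fmean(values) ** 2


# Hypothetical groups, standing in for the errors of two imputation methods.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 4.0, 6.0]

t_stat = bartlett_statistic([group_a, group_b])
rel_var_a = relative_variance(group_a)

print(t_stat)
print(rel_var_a)
```

A statistic near zero is consistent with equal group variances. When the test does not separate the methods, a smaller relative variance under the assumed definition would point to the more stable method.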
