The R Package hmi: A Convenient Tool for Hierarchical Multiple Imputation and Beyond

Matthias Speidel, Jörg Drechsler, Shahab Jolani

doi:10.18637/jss.v095.i09

Open Access

Abstract

Applications of multiple imputation have long outgrown the traditional context of dealing with item nonresponse in cross-sectional data sets. Nowadays, multiple imputation is also applied to impute missing values in hierarchical data sets, address confidentiality concerns, combine data from different sources, or correct measurement errors in surveys. However, software development has not kept up with these recent extensions. Most imputation software can only deal with item nonresponse in cross-sectional settings, and extensions for hierarchical data, if available at all, are typically limited in scope. Furthermore, to our knowledge no software is currently available for dealing with measurement error using multiple imputation approaches. The R package hmi tries to close some of these gaps. It offers multiple imputation routines in hierarchical settings for many variable types (for example, nominal, ordinal, or continuous variables). It also provides imputation routines for interval data and handles a common measurement error problem in survey data: biased inferences due to implicit rounding of the reported values. The user-friendly setup, which requires only the data and optionally the specification of the analysis model of interest, makes the package especially attractive for users less familiar with the peculiarities of multiple imputation. The compatibility with the popular mice package (Van Buuren and Groothuis-Oudshoorn 2011) ensures that the rich set of analysis and diagnostic tools and post-imputation functions available in mice can be used easily once the data have been imputed.

Highlights

Forty years after Donald Rubin’s seminal paper (Rubin 1978), which introduced the concept of multiple imputation, the approach has been shown to be useful in many contexts going far beyond the classical item nonresponse in cross-sectional surveys for which it was originally proposed (Reiter and Raghunathan 2007).
The function hmi returns two additional elements within the mids object that are not available from mice: gibbs and pooling. The former allows checking the convergence of the Gibbs sampler chains generated by MCMCglmm.
With hmi, we provide comprehensive but easy-to-handle tools for multiple imputation of hierarchical data sets.

Summary

Introduction

Forty years after Donald Rubin’s seminal paper (Rubin 1978), which introduced the concept of multiple imputation, the approach has been shown to be useful in many contexts going far beyond the classical item nonresponse in cross-sectional surveys for which it was originally proposed (Reiter and Raghunathan 2007). As discussed in Heitjan and Rubin (1991), coarse data are data for which the true values are not observed in a precise way. This includes missing data as a special case, as well as rounded, grouped, censored, and interval data.

The hmi package offers routines for imputing plausible values if it is only known (for some of the observations) that the exact value lies in certain intervals, for example if the data are censored. Apart from hmi, such imputation routines are only available in Stata. The package also provides imputation routines for semi-continuous variables, that is, variables which have a spike at one value (typically zero) but can be considered continuous otherwise. These imputation routines are available in several software packages, but are not offered in mice.

Multiple imputation for hierarchical data sets

Multilevel linear models

Multilevel generalized linear models

Dealing with missing values in hierarchical data

Multiple imputation using multilevel models

Existing imputation routines for hierarchical data and their limitations

Our contribution for the imputation of hierarchical data

Multiple imputation for interval data

Analyzing interval data

Methodology of multiple imputation for interval data

Our contribution for the imputation of interval data

Multiple imputation for data affected by heaping

Analyzing rounded data

Methodology of multiple imputation for data affected by heaping

Our contribution for the imputation of data affected by heaping

Software

Checks and preparations

Imputation cycles

The different supported types of variables

Pre-definition of the variable types

Output of hmi

Convergence checks

Pooling

Multilevel data

Before starting imputation

Running the imputation

Monitoring convergence

Analyzing the imputed data

Interval data

Some useful functions for interval data

Variables affected by heaping

Findings

Conclusion

Suggestion for rounding degrees

Journal: Journal of Statistical Software	Publication Date: Jan 1, 2020
Citations: 3	License type: cc-by
