Abstract

For regression with covariates missing not at random where the missingness depends on the missing covariate values, complete-case (CC) analysis leads to consistent estimation when the missingness is independent of the response given all covariates, but it may not have the desired level of efficiency. We propose a general empirical likelihood framework to improve estimation efficiency over the CC analysis. We expand on methods in Bartlett et al. (2014,Biostatistics 15, 719-730) and Xie and Zhang (2017, Int J Biostat 13, 1-20) that improve efficiency by modeling the missingness probability conditional on the response and fully observed covariates by allowing the possibility of modeling other data distribution-related quantities. We also give guidelines on what quantities to model and demonstrate that our proposal has the potential to yield smaller biases than existing methods when the missingness probability model is incorrect. Simulation studies are presented, as well as an application to data collected from the US National Health and Nutrition Examination Survey.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call