Abstract

We propose a method for estimating the regression parameters in a linear regression model for Gaussian data when the outcome variable is missing for some subjects and missingness is thought to be nonignorable. Throughout, we assume that missingness is restricted to the outcome variable and that the covariates are fully observed. Although maximum likelihood estimation of the regression parameters is possible once joint models for the outcome variable and the nonignorable missing data mechanism have been specified, these models are fundamentally nonidentifiable unless unverifiable modeling assumptions are imposed. In this paper, rather than explicitly modeling the nonignorable missingness mechanism, we consider the use of a ‘protective’ estimator of the regression parameters (Brown, 1990). To implement the proposed method, it is necessary to assume that the outcome variable and one of the covariates have an approximate bivariate normal distribution, conditional on the remaining covariates. In addition, it is assumed that the missing data mechanism is conditionally independent of this covariate, given the outcome variable and the remaining covariates; the latter is referred to as the ‘protective’ assumption. A method of moments approach is used to obtain the protective estimator of the regression parameters; the jackknife (Quenouille, 1956) is used to estimate the variance. The method is illustrated using data on the persistence of maternal smoking from the Six Cities Study of the health effects of air pollution (Ware et al., 1984). The results of a simulation study are presented that examine the magnitude of any finite sample bias.
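For intuition, the simplest case (one covariate, no remaining covariates) can be sketched as follows. This is an illustrative reconstruction under the assumptions stated above, not the paper's implementation. Under the protective assumption, the regression of the covariate on the outcome among complete cases is undistorted by the missingness, and combining it with the fully observed moments of the covariate yields moment estimates for the outcome. The function names `protective_estimate` and `jackknife_se`, the simulated data, and the logistic missingness model are all hypothetical choices made for this sketch.

```python
import numpy as np


def protective_estimate(x, y):
    """Method-of-moments sketch for the regression of y on x.

    x is fully observed; y holds np.nan where the outcome is missing.
    Assumes (x, y) is bivariate normal and that missingness depends on y
    only (the 'protective' assumption). Returns [intercept, slope].
    """
    obs = ~np.isnan(y)
    xc, yc = x[obs], y[obs]

    # Regression of x on y among complete cases: x = a + b*y + error.
    yc_dev = yc - yc.mean()
    b = np.mean((xc - xc.mean()) * yc_dev) / np.mean(yc_dev ** 2)
    a = xc.mean() - b * yc.mean()
    s2 = np.mean((xc - a - b * yc) ** 2)  # residual variance of x given y

    # Marginal moments of x use every subject, since x is never missing.
    mu_x, var_x = x.mean(), x.var()

    # Solve mu_x = a + b*mu_y and var_x = b^2*var_y + s2 for the y moments.
    mu_y = (mu_x - a) / b
    var_y = (var_x - s2) / b ** 2

    # cov(x, y) = b*var_y, so the slope of y on x is b*var_y / var_x.
    slope = b * var_y / var_x
    intercept = mu_y - slope * mu_x
    return np.array([intercept, slope])


def jackknife_se(x, y, estimator):
    """Leave-one-out jackknife standard errors for estimator(x, y)."""
    n = len(x)
    reps = np.array(
        [estimator(np.delete(x, i), np.delete(y, i)) for i in range(n)]
    )
    center = reps.mean(axis=0)
    variance = (n - 1) / n * np.sum((reps - center) ** 2, axis=0)
    return np.sqrt(variance)


# Hypothetical simulated data: true intercept 1.0, true slope 2.0, and a
# probability of missingness that increases with the outcome itself.
rng = np.random.default_rng(0)
n = 1500
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)
p_missing = 1.0 / (1.0 + np.exp(-(y_full - 1.0)))
y = np.where(rng.random(n) < p_missing, np.nan, y_full)

estimate = protective_estimate(x, y)
std_err = jackknife_se(x, y, protective_estimate)
print("protective estimate (intercept, slope):", estimate)
print("jackknife standard errors:", std_err)
```

The sketch divides by the complete-case slope `b`, so it becomes unstable when the covariate and the outcome are only weakly associated; this is one reason to examine finite-sample behaviour by simulation.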

