Abstract

For longitudinal binary data with non-monotone, non-ignorable missing outcomes over time, a full likelihood approach is algebraically complicated, and maximum likelihood estimation can be computationally prohibitive when there are many follow-up times. We propose pseudo-likelihoods to estimate the covariate effects on the marginal probabilities of the outcomes, in addition to the association parameters and missingness parameters. The pseudo-likelihood requires specification of the distribution for the data at all pairs of times on the same subject, but makes no assumptions about the joint distribution of the data at three or more times on the same subject, so the method can be considered semi-parametric. By contrast, maximum likelihood requires the full likelihood to be correctly specified in order to obtain consistent estimates. We show in simulations that our proposed pseudo-likelihood produces a more efficient estimate of the regression parameters than the pseudo-likelihood for non-ignorable missingness proposed by Troxel et al. (1998). Application to data from the Six Cities study (Ware et al., 1984), a longitudinal study of the health effects of air pollution, is discussed.

Highlights

  • Longitudinal studies in which each subject is to be observed at a fixed number of time points have become very popular in social science and medical applications

  • We focus on the case where the response variable over time is binary and we are interested in modeling the marginal means or success probabilities; the time dependence or association between pairs of responses is commonly modeled in terms of pairwise correlations or odds ratios

  • We propose a pseudo-likelihood approach, based on specifying the distribution of the data at all pairs of times on the same subject; our pseudo-likelihood makes no assumptions about the joint distribution of the data at three or more times on the same subject, so the method can be considered semi-parametric in this sense

Summary

Introduction

Longitudinal studies in which each subject is to be observed at a fixed number of time points have become very popular in social science and medical applications. In the case of wheeze, it is quite plausible that a child might miss a visit because he or she is not wheezing and so did not come in for a doctor's visit; we would expect that the parent of a child who is wheezing would be more likely to keep the doctor's appointment, and have wheeze measured at that time point. This implies that missingness in this study may depend on the unobserved outcome of interest and may be "non-ignorable." Troxel et al. (1998) proposed a pseudo-likelihood that is formed by naively assuming that the longitudinal binary measurements are independent over time. Their pseudo-likelihood assumes a marginal logistic regression model for the outcome at each time point, and that the missingness probability at a given time depends only on the possibly missing response at that time and the covariates (the covariates are assumed to be fully observed).
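To make the structure of the independence pseudo-likelihood concrete, the following is a minimal sketch written from the verbal description above, not the authors' or Troxel et al.'s code. The function name and the parameter names `beta`, `alpha`, and `gamma` are hypothetical, and the logistic form of the missingness model is an assumption made here for illustration; the text states only that missingness depends on the current response and the covariates. An observed outcome contributes P(Y = y | x) P(R = 1 | y, x), while a missing outcome contributes the sum of P(Y = y | x) P(R = 0 | y, x) over y = 0, 1.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def independence_pseudo_loglik(beta, alpha, gamma, X, Y, R):
    """Pseudo-log-likelihood treating all (subject, time) cells as independent.

    X : (n, T, p) fully observed covariates
    Y : (n, T) binary outcomes; values where R == 0 are ignored (may be NaN)
    R : (n, T) indicators, 1 = outcome observed, 0 = outcome missing
    beta  : (p,) coefficients of the marginal logistic outcome model
    alpha : (p,) covariate coefficients of the assumed logistic missingness model
    gamma : scalar effect of the (possibly missing) outcome on being observed
    """
    p1 = expit(X @ beta)               # P(Y = 1 | x)
    lin = X @ alpha
    obs_if_y1 = expit(lin + gamma)     # P(R = 1 | Y = 1, x)
    obs_if_y0 = expit(lin)             # P(R = 1 | Y = 0, x)

    # Observed cells: P(Y = y | x) * P(R = 1 | y, x)
    y = np.where(R == 1, Y, 0)
    lik_obs = np.where(y == 1, p1 * obs_if_y1, (1 - p1) * obs_if_y0)

    # Missing cells: sum over the unobserved y of P(Y = y | x) * P(R = 0 | y, x)
    lik_mis = p1 * (1 - obs_if_y1) + (1 - p1) * (1 - obs_if_y0)

    return np.sum(np.log(np.where(R == 1, lik_obs, lik_mis)))
```

In this sketch, setting `gamma` to zero makes the missingness probability depend on the covariates only, so a nonzero `gamma` is what encodes non-ignorable missingness.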

Underlying Data Model
Pseudo-Likelihood under Naive Assumption of Independence
Pseudo-Likelihood Methods with Non-Ignorable Non-Monotone Missing Outcomes
Six Cities Example
Simulation Study
Discussion