Abstract

Regression models for correlated binary outcomes are commonly fit using Generalized Estimating Equations (GEE). GEE uses the Liang and Zeger sandwich estimator to produce unbiased standard error estimates for the regression coefficients in large samples, even when the covariance structure is misspecified. The sandwich estimator performs best in balanced designs with a large number of participants and few repeated measurements. It is not without drawbacks: its asymptotic properties do not hold in small samples, where it is biased downwards and underestimates the variances. In this project, a modified form of the sandwich estimator is proposed to correct this deficiency. The performance of the new estimator is compared with that of the traditional Liang and Zeger estimator and with alternative forms proposed by Morel, by Pan, and by Mancl and DeRouen. Each estimator is assessed by the coverage probabilities of nominal 95% confidence intervals for the regression coefficients, using data simulated under various combinations of sample size and outcome prevalence with Independence (IND), Autoregressive (AR), and Compound Symmetry (CS) correlation structures. This research is motivated by investigations involving rare-event outcomes in aviation data.

Highlights

  • Regression models with binary outcome variables are prevalent in all research disciplines

  • We propose an improved sandwich estimator that can produce unbiased estimates of variances and covariances in studies of correlated data with rare events and small sample sizes

  • When we analyze the sample of 30 subjects, the variability of the sandwich estimators’ variance is even larger. In situations where the outcome is of low prevalence and the sample size is small, the choice of sandwich estimator affects the outcome of hypothesis testing concerning the regression coefficients


Summary

Introduction

Regression models with binary outcome variables are prevalent in all research disciplines. One of the strengths of GEE is that the sandwich, or robust, variance estimator produces unbiased standard errors for the regression coefficients in large samples, even when the covariance structure is misspecified. We propose an improved sandwich estimator that can produce unbiased estimates of variances and covariances in studies of correlated data with rare events and small sample sizes. Our approach is to adjust the sandwich estimator to compensate for its underestimation in these situations. The adjustment takes an alternative sandwich estimator, developed by Pan, and improves its performance in small-sample and rare-event settings by adding an appropriate inflation factor, while still preserving the asymptotic properties of the sandwich estimator. The results of Pan's initial simulations, using exchangeable and independence covariance structures with both binary and continuous outcome variables, support this claim [5].
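For orientation, the standard Liang and Zeger sandwich estimator that the corrections start from can be sketched in a few lines. This is a minimal illustration only, not the modified estimator proposed here and not Pan's estimator. It assumes a logit link and an independence working correlation, under which the GEE point estimates coincide with ordinary logistic regression. The function names, the simulated data, and the parameter values are all hypothetical.

```python
import numpy as np

def fit_logistic_independence(X, y, n_iter=25):
    """Solve the GEE under an independence working correlation.

    With a logit link this reduces to ordinary logistic regression,
    solved here by Newton-Raphson (illustrative helper, not from the paper).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        w = mu * (1.0 - mu)
        # Newton step: (X' W X)^{-1} X' (y - mu)
        beta = beta + np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - mu))
    return beta

def sandwich_lz(X, y, cluster, beta):
    """Uncorrected Liang-Zeger sandwich covariance: bread^{-1} meat bread^{-1}."""
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    w = mu * (1.0 - mu)
    bread = X.T @ (w[:, None] * X)          # sum_i X_i' A_i X_i
    meat = np.zeros_like(bread)
    for c in np.unique(cluster):
        idx = cluster == c
        score = X[idx].T @ (y[idx] - mu[idx])  # per-subject score contribution
        meat += np.outer(score, score)
    bread_inv = np.linalg.inv(bread)
    return bread_inv @ meat @ bread_inv

# Hypothetical data: 30 subjects with 4 repeated binary measurements each.
# Outcomes are drawn independently within subject purely to keep the sketch short.
rng = np.random.default_rng(0)
cluster = np.repeat(np.arange(30), 4)
X = np.column_stack([np.ones(cluster.size), rng.normal(size=cluster.size)])
true_beta = np.array([-1.0, 0.5])           # assumed values for illustration
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

beta_hat = fit_logistic_independence(X, y)
cov_lz = sandwich_lz(X, y, cluster, beta_hat)
se_lz = np.sqrt(np.diag(cov_lz))            # robust standard errors
```

The small-sample corrections discussed in this paper all modify the "meat" term or rescale the final product, which is why the uncorrected form is a useful reference point when comparing them.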

Mancl and DeRouen Estimator
Morel Estimator
Asymptotic Properties
Simulation Studies
Demonstration of Bias as a Poor Performance Measure
Coverage Probabilities
Practical Application
Method
Findings
Conclusion