Abstract

Generalized Estimating Equation (GEE) is a marginal model widely applied to longitudinal/clustered data analysis in clinical trials and biomedical studies. We provide a systematic review of GEE, covering basic concepts as well as several recent developments motivated by practical challenges in real applications. The topics covered are the selection of the “working” correlation structure, sample size and power calculation, and the issue of informative cluster size, because these aspects play important roles in the use of GEE and in its statistical inference. A brief summary and a discussion of potential research interests regarding GEE are provided at the end.

Highlights

  • Generalized Estimating Equation (GEE) is a general statistical approach to fitting a marginal model for longitudinal/clustered data analysis, and it has been widely applied in clinical trials and biomedical studies [1,2,3]

  • The variance-covariance matrix of the responses is treated as a set of nuisance parameters in GEE, so model fitting turns out to be easier than for mixed-effect models [12]

  • The variance-covariance matrix for $Y_i$ is denoted by $V_i = \phi A_i^{1/2} R_i(\alpha) A_i^{1/2}$, where $A_i = \mathrm{Diag}\{\nu(\mu_{i1}), \ldots, \nu(\mu_{in_i})\}$ and the so-called “working” correlation structure $R_i(\alpha)$ describes the pattern of correlation among the measures within a subject; $R_i(\alpha)$ is of size $n_i \times n_i$ and depends on a vector of association parameters denoted by $\alpha$
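
The construction of the working covariance in the last highlight can be illustrated with a minimal sketch. It assumes an exchangeable working correlation (one common choice, with a single association parameter) and a Bernoulli variance function for a binary outcome. The function names, the three-visit subject, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def exchangeable_corr(n, alpha):
    """Return an n x n exchangeable working correlation matrix:
    1 on the diagonal and alpha everywhere else."""
    return [[1.0 if j == k else alpha for k in range(n)] for j in range(n)]


def working_covariance(mu, variance_fn, corr, phi=1.0):
    """Compute V_i = phi * A_i^{1/2} R_i(alpha) A_i^{1/2}, where A_i is the
    diagonal matrix of variance-function values at the marginal means mu."""
    # Square roots of the diagonal entries of A_i
    sd = [math.sqrt(variance_fn(m)) for m in mu]
    n = len(mu)
    # Because A_i is diagonal, entry (j, k) is phi * sd_j * R_jk * sd_k
    return [[phi * sd[j] * corr[j][k] * sd[k] for k in range(n)]
            for j in range(n)]


def bernoulli_var(m):
    """Variance function of a binary outcome with mean m."""
    return m * (1.0 - m)


# Hypothetical subject with three visits (illustrative values only)
mu_i = [0.2, 0.5, 0.8]
R_i = exchangeable_corr(len(mu_i), alpha=0.3)
V_i = working_covariance(mu_i, bernoulli_var, R_i, phi=1.0)

for row in V_i:
    print([round(v, 4) for v in row])
```

Other working structures (for example independence or AR(1)) would only change how `R_i` is built; the sandwich form of `V_i` stays the same.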

Summary

Introduction

Generalized Estimating Equation (GEE) is a general statistical approach to fitting a marginal model for longitudinal/clustered data analysis, and it has been widely applied in clinical trials and biomedical studies [1,2,3]. In one such example, the primary goal is to investigate whether there is a significant gender difference in dental growth measures and in their temporal trend as age increases [4]. In such data analysis, responses from the same individual tend to be “more alike”; incorporating within-subject and between-subject variation into model fitting is necessary to improve the efficiency of the estimation and the power [5]. For discrete random vectors, the correlation matrix is usually complicated, and it is not easy to obtain multivariate distributions with specified correlation structures. These limitations have led researchers to work actively in this area to develop novel methodologies. For binary longitudinal data, estimation of the correlation coefficients based on conditional residuals has been proposed [20,21,22]. This review focuses on three specific topics, namely model selection, power analysis, and the issue of informative cluster size, and reviews the recent developments in each.

Method
Simulation
Future Direction and Discussion
