Biostatistics of generalized estimating equations in developmental medicine and child neurology.

Abstract

This review provides clinicians and researchers in developmental medicine and paediatric neurology with a guide to using generalized estimating equations (GEEs) for longitudinal paediatric data. We present a concise primer on core GEE concepts for non-statistical audiences, emphasizing paediatric applications. Using a randomized trial of oestrogen versus placebo for postnatal depression, we provide a reproducible workflow (in R code) for continuous and binary outcomes. We compare exchangeable and autoregressive (first-order autoregressive model) working correlations and discuss implications for efficiency and interpretation. Because the data set is maternal and contains no child outcomes, we treat it as a perinatal case study relevant to child development and use it to illustrate marginal (population-averaged) inference in longitudinal clinical data. GEEs yielded stable marginal estimates across correlation structures when the mean model was correctly specified. Oestrogen was associated with significantly lower odds of postnatal depression than placebo, with negligible differences in model fit (correlation information criterion). Statistical choices mainly affected efficiency and standard errors rather than effect sizes. GEEs provide a robust, interpretable framework for analysing correlated outcomes in paediatric research. Paired with a reproducible example, this helps clinicians and researchers select appropriate models, report working correlations transparently, and interpret marginal effects in practice.
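
As a rough illustration of the two working correlation structures compared in the abstract, the following Python sketch builds an exchangeable matrix and a first-order autoregressive (AR(1)) matrix for one subject's repeated measures. The function names, the number of visits, and the value of `alpha` are assumptions chosen for illustration; they are not taken from the review, whose own workflow is written in R.

```python
def exchangeable(n_times, alpha):
    """Exchangeable working correlation: every pair of visits shares the same correlation alpha."""
    return [[1.0 if j == k else alpha for k in range(n_times)]
            for j in range(n_times)]


def ar1(n_times, alpha):
    """AR(1) working correlation: correlation decays as alpha ** |j - k| with the visit lag."""
    return [[alpha ** abs(j - k) for k in range(n_times)]
            for j in range(n_times)]


if __name__ == "__main__":
    # Hypothetical setting: 4 visits per subject, alpha = 0.5 (illustrative values only).
    for name, matrix in (("exchangeable", exchangeable(4, 0.5)),
                         ("AR(1)", ar1(4, 0.5))):
        print(name)
        for row in matrix:
            print("  ", [round(value, 3) for value in row])
```

Under the exchangeable structure, visits one and four are as strongly correlated as visits one and two. Under AR(1), the correlation shrinks with each additional lag. This is the practical difference an analyst weighs when choosing between them.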

Similar Papers
  • Research Article
  • Cite count: 170
  • 10.1111/j.1469-8749.2002.tb00330.x
Two distinct forms of minor neurological dysfunction: perspectives emerging from a review of data of the Groningen Perinatal Project
  • Aug 1, 2002
  • Developmental Medicine & Child Neurology
  • Mijna Hadders‐Algra

  • Research Article
  • Cite count: 4
  • 10.1161/circulationaha.108.836767
Letter by Patel Regarding Article, “A Primer in Longitudinal Data Analysis”
  • Jul 27, 2009
  • Circulation
  • Chirag B Patel

Chirag B. Patel, MSE
Department of Diagnostic and Interventional Imaging, University of Texas Medical School at Houston, Houston, Tex
Originally published 28 Jul 2009. https://doi.org/10.1161/CIRCULATIONAHA.108.836767. Circulation. 2009;120:e25

To the Editor:

I read with great interest the article titled “A Primer in Longitudinal Data Analysis” by Fitzmaurice and Ravichandran [1]. Focusing on longitudinal data, the authors highlighted the important chasm between advancements in statistical methods and the analysis of current biomedical studies. Furthermore, the pros and cons of 2 particular approaches (analysis of response profiles and linear mixed-effects models) were well explained through case examples of previous studies. However, with the exception of a passing mention of cited sources for further reading (references 7 and 9 in the original article), the authors did not explain another important approach for analyzing longitudinal data: generalized estimating equations (GEEs) [2].

GEEs can be used to model correlated data from repeated measures over the course of a longitudinal study. With respect to the defining features of longitudinal studies explained by Fitzmaurice and Ravichandran (eg, covariance structure and balanced versus unbalanced designs), GEEs have been shown to be more robust when missing data, imputation techniques, and other factors are considered [3]. Of particular interest to longitudinal clinical studies is the use of GEEs to identify the best correlation structure and subset of covariates for a given model. Seemingly conflicting reports on the inefficiency of GEEs compared with independence estimating equations [4,5] can be explained by differences in the type of data and covariate and correlation structures under study. The use of more recent models, such as the conditional second-order GEE estimator [6], has yielded improved efficiency. Furthermore, GEE models can be implemented in the software packages discussed by Fitzmaurice and Ravichandran.

Clinician investigators and article reviewers would benefit greatly from knowing which model for the analysis of longitudinal data (eg, analysis of response profiles, linear mixed-effects models, GEE, independence estimating equation, or conditional second-order GEE) is most apt given a study’s design, data structure, and other relevant factors. Moreover, and in line with Fitzmaurice and Ravichandran’s intent, such an understanding will lead to an improved interpretation of longitudinal study results. This would get us “closer to reality” in terms of understanding the true impact of devices, pharmaceuticals, and other interventions for improved care of patients with cardiovascular risk factors and disease.

Disclosures
None.

References
1. Fitzmaurice GM, Ravichandran C. A primer in longitudinal data analysis. Circulation. 2008; 118: 2005–2010.
2. Zeger SL, Liang KY, Albert PS. Models for longitudinal data: a generalized estimating equation approach. Biometrics. 1988; 44: 1049–1060.
3. Twisk J, de Vente W. Attrition in longitudinal studies: how to deal with missing data. J Clin Epidemiol. 2002; 55: 329–337.
4. Sutradhar BC, Das K. On the efficiency of regression estimators in generalised linear models for longitudinal data. Biometrika. 1999; 86: 459–465.
5. Fitzmaurice GM. A caveat concerning independence estimating equations with multivariate binary data. Biometrics. 1995; 51: 309–317.
6. Vonesh EF, Wang H, Nie LDM. Conditional second-order generalized estimating equations for generalized linear and nonlinear mixed-effects models. J Amer Statistical Assoc. 2002; 97: 271–283.

  • Research Article
  • Cite count: 117
  • 10.1111/j.1469-8749.2002.tb00287.x
Visual impairment in infancy: impact on neurodevelopmental and neurobiological processes
  • Nov 1, 2002
  • Developmental Medicine & Child Neurology
  • Patricia M Sonksen + 1 more

  • Research Article
  • Cite count: 28
  • 10.1111/j.1469-8749.2000.tb00371.x
Evidence of the effects of intrathecal baclofen for spastic and dystonic cerebral palsy
  • Sep 1, 2000
  • Developmental Medicine & Child Neurology
  • Charlene Butler + 17 more

  • Research Article
  • Cite count: 300
  • 10.1152/jn.1944.7.6.391
PIAL CIRCULATION AND SPREADING DEPRESSION OF ACTIVITY IN THE CEREBRAL CORTEX
  • Nov 1, 1944
  • Journal of Neurophysiology
  • Aristides A P Leo

  • Front Matter
  • Cite count: 1
  • 10.1016/j.pmrj.2018.11.002
Dealing With Binary Repeated Measures Data
  • Nov 22, 2018
  • PM&R
  • Christophe Toukam Tchakoute + 1 more

  • Research Article
  • Cite count: 14
  • 10.1002/sim.962
Application of robust estimating equations to the analysis of quantitative longitudinal data.
  • Nov 9, 2001
  • Statistics in Medicine
  • Mingxiu Hu + 1 more

A model fit by generalized estimating equations (GEE) has been used extensively for the analysis of longitudinal data in medical studies. To some extent, GEE tries to minimize a quadratic form of the residuals, and therefore is not robust in the sense that it, like least squares estimates, is sensitive to heavy-tailed distributions, contaminated distributions and extreme values. This paper describes the family of truncated robust estimating equations and its properties for the analysis of quantitative longitudinal data. Like GEE, the robust estimating equations aim to assess the covariate effects in the generalized linear model in the complete population of observations, but in a manner that is more robust to the influence of aberrant observations. A simulation study has been conducted to compare the finite-sample performance of GEE and the robust estimating equations under a variety of error distributions and data structures. It shows that the parameter estimates based on GEE and the robust estimating equations are approximately unbiased and the type I errors of Wald tests do not tend to be inflated. GEE is slightly more efficient with pure normal data, but the efficiency of GEE declines much more quickly than that of the robust estimating equations when the data become contaminated or have heavy tails, which makes the robust estimating equations advantageous with non-normal data. Both GEE and the robust estimating equations are applied to a longitudinal analysis of renal function in the Diabetes Control and Complications Trial (DCCT). For this application, GEE seems to be sensitive to the working correlation specification in that different working correlation structures may lead to different conclusions about the effect of intensive diabetes treatment. On the other hand, the robust estimating equations consistently conclude that the treatment effect is highly significant no matter which working correlation structure is used. The DCCT Research Group also demonstrated a significant effect using a mixed-effects longitudinal model.

  • Research Article
  • Cite count: 2
  • 10.1111/j.1469-8749.2000.tb00107.x
‘Quality of life in families of children with disabilities’
  • May 1, 2000
  • Developmental Medicine & Child Neurology
  • H Bode + 2 more

Developmental Medicine & Child Neurology, Volume 42, Issue 5, p. 354. Free Access.
H Bode, K Weidner, M Storck
University Children's Hospital, Department of Social Pediatrics and Child Neurology, Schillerstr5, D89077 Ulm, Germany
The full questionnaire and details of the study are available from the first author. E-mail: harald.bode@medizin.uni-ulm.de
First published: 13 February 2007. https://doi.org/10.1111/j.1469-8749.2000.tb00107.x
No abstract is available for this article.

  • Research Article
  • 10.1027/0044-3409.16.3.147
Human Motor Behavior
  • Jan 1, 2008
  • Zeitschrift für Psychologie / Journal of Psychology
  • Christa Einspieler + 2 more

  • Research Article
  • Cite count: 1
  • 10.5336/biostatic.2017-55668
A Generalized Estimating Equations Approach for Multi-Way Contingency Tables with Longitudinal Data in Case-Control Studies
  • Jan 1, 2017
  • Turkiye Klinikleri Journal of Biostatistics
  • Melike Kaya Bahçecitapar + 1 more

Objective: Log-linear analysis is a classical statistical method for analyzing associations between variables in multi-way contingency tables. The generalized estimating equations (GEEs) approach is popular, especially for analyzing longitudinal data, because it accounts for the correlation of repeated measures over time within subjects by defining a so-called "working correlation structure". GEEs provide consistent regression parameter estimates even if the working correlation structure is misspecified. In this paper, we propose the GEEs approach for analyzing a multi-way contingency table with longitudinal data, which consists of more than one contingency table obtained over time in case-control studies, and we examine the GEEs method under four different working correlation structures for the correlations between longitudinal count data in the table. Material and Methods: Log-linear analysis and the GEEs method for longitudinal data in the multi-way contingency table with a time factor were performed with SAS 9.4 statistical software. A real genetic association case-control study with longitudinal data was used to compare both methods. Results: Classical log-linear analysis and the GEEs method with an independent (IND) working correlation structure generate similar parameter estimates. The linear model fitted to the longitudinal data in the multi-way contingency table was found to be the same for log-linear analysis and for the GEEs approach under all correlation structures. Conclusion: This study indicates that the GEEs approach provides more efficient and unbiased regression parameter estimates for a multi-way contingency table formed from responses measured repeatedly over time in a case-control study.

  • Research Article
  • Cite count: 50
  • 10.1111/j.1541-0420.2012.01758.x
Model Selection for Generalized Estimating Equations Accommodating Dropout Missingness
  • Mar 29, 2012
  • Biometrics
  • Chung‐Wei Shen + 1 more

The generalized estimating equation (GEE) has been a popular tool for marginal regression analysis with longitudinal data, and its extension, the weighted GEE approach, can further accommodate data that are missing at random (MAR). Model selection methodologies for GEE, however, have not been systematically developed to allow for missing data. We propose the missing longitudinal information criterion (MLIC) for selection of the mean model, and the MLIC for correlation (MLICC) for selection of the correlation structure in GEE when the outcome data are subject to dropout/monotone missingness and are MAR. Our simulation results reveal that the MLIC and MLICC are effective for variable selection in the mean model and for selecting the correlation structure, respectively. We also demonstrate the remarkable drawbacks of naively treating incomplete data as if they were complete and applying the existing GEE model selection method. The utility of the proposed method is further illustrated by two real applications involving missing longitudinal outcome data.
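
The weighting idea behind weighted GEE can be sketched in Python as follows. The sketch assumes monotone dropout and assumes that the conditional probabilities of remaining in the study at each visit have already been estimated (for example, from a logistic model under MAR). The function name `ipw_weights` and the probabilities are hypothetical. This is a generic inverse-probability-weighting illustration, not the MLIC or MLICC procedure proposed in the paper.

```python
def ipw_weights(stay_probs):
    """Inverse-probability weights for one subject under monotone dropout.

    stay_probs[t] is the (assumed, pre-estimated) probability that the subject
    is still observed at visit t, given that they were observed at visit t - 1.
    The weight at visit t is 1 / (probability of being observed through visit t).
    """
    weights = []
    cumulative = 1.0
    for p in stay_probs:
        if not 0.0 < p <= 1.0:
            raise ValueError("each probability must lie in (0, 1]")
        cumulative *= p            # probability of remaining through this visit
        weights.append(1.0 / cumulative)
    return weights


if __name__ == "__main__":
    # Hypothetical subject: always seen at baseline, then a 50% chance of
    # remaining at each of the next two visits (illustrative values only).
    print(ipw_weights([1.0, 0.5, 0.5]))
```

Observations from later visits receive larger weights because they stand in for similar subjects who had already dropped out by that visit.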

  • Research Article
  • Cite count: 19
  • 10.1186/1471-2288-8-28
Comparison of generalized estimating equations and quadratic inference functions using data from the National Longitudinal Survey of Children and Youth (NLSCY) database
  • May 9, 2008
  • BMC Medical Research Methodology
  • Adefowope Odueyungbo + 3 more

Background: The generalized estimating equations (GEE) technique is often used in longitudinal data modeling, where investigators are interested in population-averaged effects of covariates on responses of interest. GEE involves specifying a model relating covariates to outcomes and a plausible correlation structure between responses at different time periods. While GEE parameter estimates are consistent irrespective of the true underlying correlation structure, the method has some limitations that include challenges with model selection due to lack of absolute goodness-of-fit tests to aid comparisons among several plausible models. The quadratic inference functions (QIF) method extends the capabilities of GEE, while also addressing some GEE limitations.
Methods: We conducted a comparative study between GEE and QIF via an illustrative example, using data from the "National Longitudinal Survey of Children and Youth (NLSCY)" database. The NLSCY dataset consists of long-term, population-based survey data collected since 1994, and is designed to evaluate the determinants of developmental outcomes in Canadian children. We modeled the relationship between hyperactivity-inattention and gender, age, family functioning, maternal depression symptoms, household income adequacy, maternal immigration status and maternal educational level using GEE and QIF. Bases for comparison include: (1) ease of model selection; (2) sensitivity of results to different working correlation matrices; and (3) efficiency of parameter estimates.
Results: The sample included 795,858 respondents (50.3% male; 12% immigrant; 6% from dysfunctional families). QIF analysis reveals that gender (male) (odds ratio [OR] = 1.73; 95% confidence interval [CI] = 1.10 to 2.71), family dysfunction (OR = 2.84, 95% CI of 1.58 to 5.11), and maternal depression (OR = 2.49, 95% CI of 1.60 to 2.60) are significantly associated with higher odds of hyperactivity-inattention. The results remained robust under GEE modeling. Model selection was facilitated in QIF using a goodness-of-fit statistic. Overall, estimates from QIF were more efficient than those from GEE using AR(1) and exchangeable working correlation matrices (relative efficiency = 1.1117 and 1.3082, respectively).
Conclusion: QIF is useful for model selection and provides more efficient parameter estimates than GEE. QIF can help investigators obtain more reliable results when used in conjunction with GEE.
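
Odds ratios and confidence intervals of the kind reported in this abstract are conventionally obtained by exponentiating a log-odds coefficient and its Wald limits. A minimal Python sketch follows, assuming a hypothetical coefficient `beta` and standard error `se`; neither quantity is reported in the abstract, and the function name is illustrative.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower limit, upper limit) from a log-odds coefficient.

    beta: regression coefficient on the log-odds scale
    se:   its standard error (for GEE, typically the robust "sandwich" estimate)
    z:    normal quantile; 1.96 gives an approximate 95% Wald interval
    """
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the calculation.
    estimate, lower, upper = odds_ratio_ci(beta=math.log(2.0), se=0.25)
    print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Because the interval is symmetric on the log scale, the odds ratio is the geometric mean of its two limits. This gives a quick consistency check when reading reported results.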

  • Research Article
  • Cite count: 3
  • 10.1044/sasd15.3.3
Assessment and Management Considerations for Oral Feeding of the Premature Infant on the Neonatal Intensive Care Unit
  • Oct 1, 2006
  • Perspectives on Swallowing and Swallowing Disorders (Dysphagia)
  • Amy S Faherty

  • Research Article
  • Cite count: 46
  • 10.1002/sim.6198
Small sample GEE estimation of regression parameters for longitudinal data.
  • May 4, 2014
  • Statistics in Medicine
  • Sudhir Paul + 1 more

Longitudinal (clustered) response data arise in many bio-statistical applications which, in general, cannot be assumed to be independent. Generalized estimating equation (GEE) is a widely used method to estimate marginal regression parameters for correlated responses. The advantage of the GEE is that the estimates of the regression parameters are asymptotically unbiased even if the correlation structure is misspecified, although their small sample properties are not known. In this paper, two bias adjusted GEE estimators of the regression parameters in longitudinal data are obtained when the number of subjects is small. One is based on a bias correction, and the other is based on a bias reduction. Simulations show that the performances of both the bias-corrected methods are similar in terms of bias, efficiency, coverage probability, average coverage length, impact of misspecification of correlation structure, and impact of cluster size on bias correction. Both these methods show superior properties over the GEE estimates for small samples. Further, analysis of data involving a small number of subjects also shows improvement in bias, MSE, standard error, and length of the confidence interval of the estimates by the two bias adjusted methods over the GEE estimates. For small to moderate sample sizes (N ≤50), either of the bias-corrected methods GEEBc and GEEBr can be used. However, the method GEEBc should be preferred over GEEBr, as the former is computationally easier. For large sample sizes, the GEE method can be used.

  • Research Article
  • Cite count: 20
  • 10.1289/ehp.1102453
Statistical Methods to Study Timing of Vulnerability with Sparsely Sampled Data on Environmental Toxicants
  • Dec 8, 2010
  • Environmental Health Perspectives
  • Brisa Ney Sánchez + 3 more
