An investigation of penalization and data augmentation to improve convergence of generalized estimating equations for clustered binary outcomes

Angelika Geroldinger, Rok Blagus, Georg Heinze, Helen Ogden

doi:10.1186/s12874-022-01641-6

Abstract

Background: In binary logistic regression, data are ‘separable’ if there exists a linear combination of explanatory variables that perfectly predicts the observed outcome, leading to non-existence of some of the maximum likelihood coefficient estimates. A popular solution to obtain finite estimates even with separable data is Firth’s logistic regression (FL), which was originally proposed to reduce the bias in coefficient estimates. The question of convergence becomes more involved when analyzing clustered data, as frequently encountered in clinical research (e.g. data collected in several study centers, or individuals contributing multiple observations), using marginal logistic regression models fitted by generalized estimating equations (GEE). From our experience we suspect that separable data are a sufficient, but not a necessary, condition for non-convergence of GEE. Thus, we expect that generalizations of approaches that can handle separable uncorrelated data may reduce but not fully remove the non-convergence issues of GEE.

Methods: We investigate one recently proposed and two new extensions of FL to GEE. With ‘penalized GEE’, the GEE are treated as score equations, i.e. as derivatives of a log-likelihood set to zero, which are then modified as in FL. We introduce two approaches motivated by the equivalence of FL and maximum likelihood estimation with iteratively augmented data. Specifically, we consider fully iterated and single-step versions of this ‘augmented GEE’ approach. We compare the three approaches with respect to convergence behavior, practical applicability and performance using simulated data and a real data example.

Results: Our simulations indicate that all three extensions of FL to GEE substantially improve convergence compared to ordinary GEE, while showing similar or even better performance in terms of accuracy of coefficient estimates and predictions. Penalized GEE often slightly outperforms the augmented GEE approaches, but this comes at the cost of a higher implementation burden.

Conclusions: When fitting marginal logistic regression models using GEE on sparse data, we recommend applying penalized GEE if one has access to a suitable software implementation, and single-step augmented GEE otherwise.
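
The equivalence between FL and maximum likelihood on iteratively augmented data, which motivates the augmented GEE approaches, can be illustrated for the simpler case of independent observations. The following is a minimal sketch, not the authors' implementation and not a GEE fit: it assumes uncorrelated data, uses the standard Firth-modified score X'(y - π + h(1/2 - π)) with h the diagonal of the weighted hat matrix, and all function names and the toy dataset are hypothetical.

```python
import numpy as np


def hat_diagonal(X, pi):
    """Diagonal of H = W^(1/2) X (X'WX)^(-1) X' W^(1/2), W = diag(pi * (1 - pi))."""
    w = pi * (1.0 - pi)
    Xw = X * np.sqrt(w)[:, None]
    info_inv = np.linalg.inv(Xw.T @ Xw)
    return np.einsum("ij,jk,ik->i", Xw, info_inv, Xw)


def firth_logistic(X, y, max_iter=200, tol=1e-10):
    """Solve the Firth-modified score equations by Fisher scoring (independent data)."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))
        h = hat_diagonal(X, pi)
        # Modified score: ordinary score plus the Firth correction term.
        score = X.T @ (y - pi + h * (0.5 - pi))
        info = X.T @ (X * (pi * (1.0 - pi))[:, None])
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def augmented_score(X, y, beta):
    """Ordinary weighted logistic score on the augmented data set.

    Each observation appears three times: once as observed (weight 1),
    once with its observed outcome (weight h/2) and once with the
    opposite outcome (weight h/2).
    """
    pi = 1.0 / (1.0 + np.exp(-X @ beta))
    h = hat_diagonal(X, pi)
    X_aug = np.vstack([X, X, X])
    y_aug = np.concatenate([y, y, 1.0 - y])
    w_aug = np.concatenate([np.ones_like(h), h / 2.0, h / 2.0])
    pi_aug = np.tile(pi, 3)
    return X_aug.T @ (w_aug * (y_aug - pi_aug))


# Hypothetical toy data with quasi-complete separation:
# every observation with x = 1 has y = 1, so the ML slope does not exist.
x = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])

beta = firth_logistic(X, y)
print("Firth estimates (intercept, slope):", beta)
print("Augmented-data score at the estimate:", augmented_score(X, y, beta))
```

In this saturated two-group example the Firth penalty amounts to adding 0.5 to each cell of the 2x2 table, so the estimates stay finite, and the weighted score on the augmented data vanishes at the Firth estimate. Extending this idea to correlated data requires a working correlation structure, which the sketch deliberately leaves out.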

Journal: BMC Medical Research Methodology
Publication Date: Jun 9, 2022
License type: open-access

Similar Papers

A Note on Marginal Linear Regression with Correlated Response Data
Wei Pan ... John E Connett
The American Statistician | VOL. 54
01 Aug 2000

Marginal regression models with a time to event outcome and discrete multiple source predictors
Heather J Litman ... Jane M Murphy
Lifetime Data Analysis | VOL. 12
02 Aug 2006

Statistical Methods to Study Timing of Vulnerability with Sparsely Sampled Data on Environmental Toxicants
Brisa Ney Sánchez ... Howard Hu
Environmental Health Perspectives | VOL. 119
08 Dec 2010

Determinants of continuing mental health service use among older persons diagnosed with depressive disorders in general hospitals: latent class analysis and GEE
Thida Mulalint ... Sasima Tongsai
BMC health services research | VOL. 22
11 Jul 2022
