Binomial outcomes in dataset with some clusters of size two: can the dependence of twins be accounted for? A simulation study comparing the reliability of statistical methods based on a dataset of preterm infants

Odile Sauzet, Janet L Peacock

doi:10.1186/s12874-017-0369-6

Abstract

Background: The analysis of perinatal outcomes often involves datasets with some multiple births. These datasets consist mostly of independent observations plus a limited number of clusters of size two (twins) and possibly of size three or more. This non-independence needs to be accounted for in the statistical analysis. Using simulated data based on a dataset of preterm infants, we have previously investigated the performance of several approaches to the analysis of continuous outcomes in the presence of some clusters of size two. Mixed models have been developed for binomial outcomes, but very little is known about their reliability when only a limited number of small clusters are present.

Methods: Using simulated data based on a dataset of preterm infants, we investigated the performance of several approaches to the analysis of binomial outcomes in the presence of some clusters of size two. Logistic models, several methods of estimation for the logistic random intercept model, and generalised estimating equations were compared.

Results: The presence of even a small percentage of twins means that a logistic regression model will underestimate all parameters. However, a logistic random intercept model fails to estimate the correlation between siblings if the percentage of twins is too small, and then provides estimates similar to those of logistic regression. The method that seems to provide the best balance between estimation of the standard error and of the parameter, for any percentage of twins, is generalised estimating equations.

Conclusions: This study has shown that neither the number of covariates nor the level-two variance necessarily affects the performance of the various methods used to analyse datasets containing twins, but when the percentage of small clusters is too small, mixed models cannot capture the dependence between siblings.
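The setting described in the abstract can be illustrated with a minimal sketch; it is not the authors' simulation code. It generates binary outcomes in which twins share a random intercept, fits an ordinary logistic regression by Newton-Raphson, and compares the model-based standard error of the slope with a cluster-robust (sandwich) standard error. The latter is what generalised estimating equations yield under an independence working correlation, which is an assumption here, since the abstract does not say which working correlation was used. All function names, parameter values (`twin_frac`, `sigma_u`, the coefficients) and the single standard-normal covariate are hypothetical.

```python
import math
import random


def simulate(n_clusters, twin_frac, beta0, beta1, sigma_u, rng):
    """Return rows (cluster_id, x, y); both twins share one random intercept."""
    rows = []
    for cid in range(n_clusters):
        u = rng.gauss(0.0, sigma_u)                      # level-two effect
        size = 2 if rng.random() < twin_frac else 1      # twin pair or singleton
        for _ in range(size):
            x = rng.gauss(0.0, 1.0)                      # infant-level covariate
            p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x + u)))
            rows.append((cid, x, 1 if rng.random() < p else 0))
    return rows


def fit_logistic(rows, iters=25):
    """Ordinary logistic regression (intercept + slope) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for _, x, y in rows:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det                # beta += H^{-1} g
        b1 += (-h01 * g0 + h00 * g1) / det
    return b0, b1


def slope_standard_errors(rows, b0, b1):
    """Model-based and cluster-robust (sandwich) SE of the slope."""
    h00 = h01 = h11 = 0.0
    scores = {}                                          # score summed per cluster
    for cid, x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)
        h00 += w
        h01 += w * x
        h11 += w * x * x
        s = scores.setdefault(cid, [0.0, 0.0])
        s[0] += y - p
        s[1] += (y - p) * x
    det = h00 * h11 - h01 * h01
    a0, a1 = -h01 / det, h00 / det                       # slope row of H^{-1}
    m00 = sum(s[0] * s[0] for s in scores.values())
    m01 = sum(s[0] * s[1] for s in scores.values())
    m11 = sum(s[1] * s[1] for s in scores.values())
    var_robust = a0 * a0 * m00 + 2.0 * a0 * a1 * m01 + a1 * a1 * m11
    return math.sqrt(a1), math.sqrt(var_robust)


rng = random.Random(1)
data = simulate(n_clusters=2000, twin_frac=0.3, beta0=-1.0, beta1=0.5,
                sigma_u=1.0, rng=rng)
b0, b1 = fit_logistic(data)
se_model, se_robust = slope_standard_errors(data, b0, b1)
print(f"slope={b1:.3f}  model SE={se_model:.4f}  robust SE={se_robust:.4f}")
```

Repeating the run over many seeds and values of `twin_frac` would give a rough analogue of the comparison described above. Fitting the logistic random intercept model itself needs numerical integration over the random effect and is left out of this sketch.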

Highlights

The analysis of perinatal outcomes often involves datasets with some multiple births.
Before methods to account for non-independent data were widely available, researchers analysing studies of preterm infants tended to ignore the non-independence in such data and to treat multiple births as independent observations [7].
As noted in [8], researchers have discussed the methods available for dealing with clustering in different contexts: Gates adjusted the standard error for a binary outcome in multiples [9], Carlin analysed twins using mixed models and generalised estimating equations (GEEs) [10], Louis discussed a range of approaches, including mixed models and GEEs, for analysing studies of repeated pregnancies [11], and Shaffer compared mixed models and GEEs for continuous and binary outcome models without covariates [12].

Summary

Methods

Using simulated data based on a dataset of preterm infants, we investigated the performance of several approaches to the analysis of binomial outcomes in the presence of some clusters of size two. Several methods of estimation for the logistic random intercept model and generalised estimating equations were compared.
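For orientation, the two model families named here are conventionally written as follows. The notation is an assumption introduced for illustration and is not taken from the paper: $i$ indexes infants within cluster (mother) $j$, $\mathbf{x}_{ij}$ is the covariate vector, and $\sigma_u^2$ is the level-two variance.

```latex
% Logistic random intercept model (cluster-specific, conditional on u_j)
\operatorname{logit}\,\Pr(y_{ij} = 1 \mid u_j)
  = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_j,
  \qquad u_j \sim N(0, \sigma_u^2)

% Marginal (population-averaged) model estimated by GEE,
% with a working correlation assumed between members of the same cluster
\operatorname{logit}\,\Pr(y_{ij} = 1)
  = \beta_0^{*} + \boldsymbol{\beta}^{*\top}\mathbf{x}_{ij}
```

The coefficients of the two models are not the same quantities: the first are conditional on the cluster effect, while the second are averaged over clusters, so the two sets of estimates generally differ when $\sigma_u^2 > 0$.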



Journal: BMC Medical Research Methodology
Publication Date: Jul 20, 2017
Citations: 11
License type: open-access


