Abstract

Background: Multilevel logistic regression models are widely used in health sciences research to account for clustering in multilevel data when estimating the effects of individual-level and cluster-level covariates on subjects' binary outcomes. Several measures for quantifying between-cluster heterogeneity have been proposed. This study compared the performance of heterogeneity measures based on the between-cluster variance (the Intra-class Correlation Coefficient (ICC) and the Median Odds Ratio (MOR)) with that of heterogeneity measures based on cluster-level covariates (the 80% Interval Odds Ratio (IOR-80) and the Sorting Out Index (SOI)).

Methods: We used several simulated datasets from a two-level logistic regression model to assess the performance of the four clustering measures. We also compared the four measures empirically in an analysis of childhood anemia in Malawi, investigating the importance of unexplained heterogeneity between communities relative to the effect of community geographic type (rural vs urban).

Results: Our findings showed that the estimates of SOI and ICC were generally unbiased with at least 10 clusters and a cluster size of at least 20. On the other hand, estimates of MOR and IOR-80 were less accurate with 50 or fewer clusters, regardless of the cluster size. The performance of all four clustering measures improved as the number of clusters and the cluster size increased, at all cluster variances. In the analysis of childhood anemia, the estimate of the between-community variance was 0.455, and the effect of community geographic type (rural vs urban) had an odds ratio (OR) of 1.21 (95% CI: 0.97, 1.52). The resulting estimates were ICC = 0.122 (indicative of low homogeneity of childhood anemia in the same community), MOR = 1.898 (indicative of large unexplained heterogeneity), IOR-80 = 0.345-3.978 and SOI = 56.7% (implying that the between-community heterogeneity was more significant in explaining the variations in childhood anemia than the estimated effect of community geographic type (rural vs urban)).

Conclusion: At least 300 clusters with sizes of at least 50 would be adequate to estimate the strength of clustering in multilevel logistic regression with negligible bias. We recommend using the SOI to assess unexplained heterogeneity between clusters when the interest also involves the effect of cluster-level covariates; otherwise, the usual intra-cluster correlation coefficient would suffice in multilevel logistic regression analyses.
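To make the four measures concrete, the sketch below computes them from a between-cluster variance and a cluster-level log odds ratio. It assumes the commonly used latent-variable definitions for a random-intercept logistic model (level-1 residual variance of π²/3 for the ICC, and the distribution of the difference between two independent cluster effects for the MOR, IOR-80 and SOI); the abstract does not state the formulas, so these are assumptions rather than the authors' confirmed method. The function name `clustering_measures` is hypothetical.

```python
import math
from statistics import NormalDist


def clustering_measures(sigma2, beta):
    """Return ICC, MOR, IOR-80 and SOI for a random-intercept logistic model.

    sigma2: between-cluster variance on the log-odds scale.
    beta:   log odds ratio of a cluster-level covariate.
    Assumed definitions (not confirmed by the source text):
      ICC    = sigma2 / (sigma2 + pi^2 / 3)
      MOR    = exp(sqrt(2 * sigma2) * z_0.75)
      IOR-80 = exp(beta -/+ z_0.90 * sqrt(2 * sigma2))
      SOI    = Phi(|beta| / sqrt(2 * sigma2))
    """
    std_normal = NormalDist()
    # Standard deviation of the difference between two independent cluster effects.
    sd_diff = math.sqrt(2.0 * sigma2)

    icc = sigma2 / (sigma2 + math.pi ** 2 / 3.0)
    mor = math.exp(sd_diff * std_normal.inv_cdf(0.75))
    half_width = std_normal.inv_cdf(0.90) * sd_diff
    ior80 = (math.exp(beta - half_width), math.exp(beta + half_width))
    soi = std_normal.cdf(abs(beta) / sd_diff)
    return {"ICC": icc, "MOR": mor, "IOR-80": ior80, "SOI": soi}


# Inputs taken from the abstract: variance 0.455 and OR 1.21 for rural vs urban.
measures = clustering_measures(sigma2=0.455, beta=math.log(1.21))
for name, value in measures.items():
    print(name, value)
```

With the rounded inputs from the abstract, the results land close to the reported estimates but are not expected to match them exactly, since the published figures come from the fitted model rather than from these rounded values.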

Highlights

  • Multilevel logistic regression models are widely used in health sciences research to account for clustering in multilevel data when estimating effects on subject binary outcomes of individual-level and cluster-level covariates

  • For a public health specialist assessing the effect of a feeding intervention on the growth of a child, children could be nested within communities which are nested within districts

  • In conclusion, measures of the between-cluster heterogeneity and effects of cluster-level covariates in multilevel logistic regression models would be estimated with negligible bias when the data has at least 300 clusters with a cluster size of at least 50

Summary

Introduction

Multilevel logistic regression models are widely used in health sciences research to account for clustering in multilevel data when estimating the effects of individual-level and cluster-level covariates on subjects' binary outcomes. For example, patients receiving a treatment for a viral infection could be nested within hospitals. Similarly, for a public health specialist assessing the effect of a feeding intervention on the growth of a child (stunted or not), children could be nested within communities, which are in turn nested within districts. In both examples, patients or children (level-1 units) may have little variation in background characteristics within each hospital or community (level-2 units). Ignoring this strong homogeneity among patients within hospitals or children within communities would lead to underestimated standard errors for the covariate effects, which may in turn lead to incomplete and misleading conclusions about the association between the treatment and the viral infection or between the feeding intervention and child growth [1,2,3].

