The relationship between statistical power and predictor distribution in multilevel logistic regression: a simulation-based approach

Oscar L Olvera Astivia, Martin Guhn, Anne Gadermann

doi:10.1186/s12874-019-0742-8

Open Access

https://doi.org/10.1186/s12874-019-0742-8

Abstract

Background: Despite the popularity of multilevel logistic regression, issues concerning the estimation of power in these models are prevalent because of the complexity involved in its calculation (i.e., computer-simulation-based approaches). These issues are further compounded by the fact that the distribution of the predictors can play a role in the power to estimate these effects. To address both matters, we present a sample of cases documenting the influence that predictor distribution has on statistical power, as well as a user-friendly, web-based application to conduct power analysis for multilevel logistic regression.

Method: Computer simulations are implemented to estimate statistical power in multilevel logistic regression with varying numbers of clusters, varying cluster sample sizes, and non-normal and non-symmetrical distributions of the Level 1/2 predictors. Power curves were simulated to see in what ways non-normal/unbalanced distributions of a binary predictor and a continuous predictor affect the detection of population effect sizes for main effects, a cross-level interaction and the variance of the random effects.

Results: Skewed continuous predictors and unbalanced binary ones require larger sample sizes at both levels than balanced binary predictors and normally-distributed continuous ones. In the most extreme case of imbalance (10% incidence) and skewness of a chi-square distribution with 1 degree of freedom, even 110 Level 2 units and 100 Level 1 units were not sufficient for all predictors to reach power of 80%, mostly hovering at around 50% with the exception of the skewed, continuous Level 2 predictor.

Conclusions: Given the complex interactive influence among sample sizes, effect sizes and predictor distribution characteristics, it seems unwarranted to make generic rule-of-thumb sample size recommendations for multilevel logistic regression, aside from the fact that larger sample sizes are required when the distributions of the predictors are not symmetric or balanced. The more skewed or imbalanced the predictor is, the larger the sample size requirements. To assist researchers in planning research studies, a user-friendly web application that conducts power analysis via computer simulations in the R programming language is provided. With this web application, users can conduct simulations, tailored to their study design, to estimate statistical power for multilevel logistic regression models.
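The data-generating step behind a simulation design like the one in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code (their application is written in R). It assumes a random-intercept logistic model with an unbalanced binary Level 1 predictor, a chi-square(1) Level 2 predictor, and a cross-level interaction. The function name, the coefficient values, and the random-intercept standard deviation are hypothetical choices made for the example and are not taken from the paper.

```python
import math
import random


def simulate_two_level_logistic(n_clusters, cluster_size, b0=0.0, b_x=0.5,
                                b_z=0.5, b_xz=0.3, tau=1.0, incidence=0.10,
                                seed=None):
    """Simulate one data set from a random-intercept logistic model.

    Returns a list of (cluster_id, x, z, y) rows, where x is a binary
    Level 1 predictor, z is a skewed Level 2 predictor, and y is 0/1.
    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        # Level 2 predictor: chi-square with 1 df (the square of a standard
        # normal), centred at its mean of 1 and scaled by its SD, sqrt(2).
        z = (rng.gauss(0.0, 1.0) ** 2 - 1.0) / math.sqrt(2.0)
        # Cluster-specific random intercept with standard deviation tau.
        u = rng.gauss(0.0, tau)
        for _ in range(cluster_size):
            # Level 1 predictor: unbalanced binary indicator.
            x = 1 if rng.random() < incidence else 0
            # Linear predictor with main effects and cross-level interaction.
            eta = b0 + u + b_x * x + b_z * z + b_xz * x * z
            p = 1.0 / (1.0 + math.exp(-eta))
            y = 1 if rng.random() < p else 0
            rows.append((j, x, z, y))
    return rows
```

Calling `simulate_two_level_logistic(110, 100, seed=1)` produces 11,000 rows that match the largest sample sizes mentioned in the abstract. In a full power analysis, each simulated data set would then be fitted with a multilevel logistic regression routine, which is omitted here.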

Highlights

Despite its popularity, issues concerning the estimation of power in multilevel logistic regression models are prevalent because of the complexity involved in its calculation
The present study: To address these issues and to follow up on the recommendations for future studies suggested by Schoeneberger [20], the purpose of this article is twofold: (i) to investigate the power of multilevel logistic regression models under commonly found conditions that may violate the assumptions made in power analysis regarding the type of predictors used; and (ii) to provide applied researchers who may be unfamiliar with the methodology of computer simulations with a user-friendly web application so that power can be

Summary

Introduction

Issues concerning the estimation of power in multilevel logistic regression models are prevalent because of the complexity involved in its calculation (i.e., computer-simulation-based approaches). These issues are further compounded by the fact that the distribution of the predictors can play a role in the power to estimate these effects. The simulation studies of Maas and Hox [9] and Paccagnella [10] provide some of the most often-cited guidelines regarding sample sizes in multilevel models: if fixed effects are of interest, a minimum of 30 Level 1 units and 10 Level 2 units is required and, if the inferences pertain to random effects, the number of Level 2 units should increase to 50. When these same sample size recommendations are used to estimate power, the resulting power generally falls short of the commonly recommended 80% [14, 15].
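In a simulation-based power analysis, power for a given effect is commonly estimated as the proportion of simulated data sets in which the test of that effect rejects the null hypothesis, and that proportion can be compared with the 80% benchmark. The sketch below shows that last step only. It assumes the p-values have already been collected from fitting the model to each replication, and the function name is hypothetical.

```python
import math


def estimate_power(p_values, alpha=0.05):
    """Return (power, monte_carlo_se) from per-replication p-values."""
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one replication")
    # Count replications in which the null hypothesis is rejected.
    rejections = sum(1 for p in p_values if p < alpha)
    power = rejections / n
    # Binomial standard error of the estimated rejection proportion.
    se = math.sqrt(power * (1.0 - power) / n)
    return power, se
```

For example, `estimate_power([0.01, 0.20, 0.03, 0.04])` gives an estimated power of 0.75. The accompanying standard error shrinks as the number of replications grows, which is why simulation studies typically use many replications per condition.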

Journal: BMC Medical Research Methodology	Publication Date: May 9, 2019
Citations: 58	License type: open-access
