Abstract
Use of stepped-wedge design (SWD) trials has increased exponentially over the past decade (Hooper & Eldridge, 2020). Concomitantly, the increasing prevalence of neck pain in the workforce means that interventions are needed and must be evaluated. Stepped-wedge designs are adaptive, so they can and should adjust to externalities. For example, the COVID-19 pandemic introduced a period effect that could perturb the design. To understand the SWD's potential vulnerability to the secular trend of a pandemic (or to other period effects in future trials), we compare a classical SWD analysis to an item response theory (IRT) approach that utilizes only the before-after segments of the collected data. Understanding the relative advantages of classical SWD analyses may inform recommendations on continuous data collection (i.e., response burden) under "known" externalities in future research. Several complex analytic approaches to SWDs have previously been reviewed, but a more advanced measurement theory may offer a complementary approach (Li & Wang, 2022). For example, Li and Wang (2022) note that SWD analyses can be classified as either conditional (cluster-specific) or marginal (population-averaged) regression models. Both of these models, however, are rooted in classical test theory. As a complement, IRT, a model-based measurement approach, may be adopted for analysing SWD data (Embretson, 1999). IRT's assumptions make it possible to account for secular trends in the SWD, so long as within-cluster equality constraints are appropriately applied. For example, regardless of the regression model chosen, Li and Wang (2022) note that high-quality SWDs should report ICCs as well as modeling assumptions for secular trends and random effects.
IRT's distribution-free parameter estimates enable meaningful assessment of change regardless of baseline values (Embretson & Poggio, 2012).

RESEARCH QUESTIONS: Should all participants continue to be measured after the intervention (burdensome measurement)? Can we impose a different analysis (a two-time-point, uncontrolled before-after design) on the stepped-wedge design to obtain period/event-robust effect estimates? Specifically, a before-after extension of IRT's bifactor model for assessing change is applied to the current dataset (Cai, 2010).

SAMPLE & METHODS: The dataset comes from a nationally funded project entitled NEXPRO (Aegerter et al., 2022). It may be classified as a "closed-cohort" variant of the SWD (Li & Wang, 2022). In a closed-cohort design, a suitable population is identified at the beginning of the study and followed up repeatedly after cross-over, with no adjustments for participant attrition or subsequent additions. Approximately half of our sample (n = 120 with 4 measurements each; 480 measurements) comprises employees working in the health-system education context with common neck problems. The outcome measure is the European Quality of Life instrument (EQ-5D-5L). Three analytic estimates are reported: 1) a classic SWD analysis (n = 296), compared with 2) IRT estimates from before-after segments (n = 194) of the dataset, and 3) a classical paired t-test. Effect-size estimates and standard errors are reported to allow interpretation of estimate bias and power, respectively. The classic SWD analysis served as the "gold standard" comparator for accurate estimates, taking full power advantage of all measurements. Specifically, a generalized linear mixed-effects model with robust estimates was used, entering random intercepts for repeated measurements and fixed effects for cluster, time, and intervention to estimate changes in EQoL.

RESULTS: Our "gold standard" SWD analysis yielded a significant effect of Cohen's d = .29, SE = .009.
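The "gold standard" model described above can be sketched in Python on simulated data. This is a minimal, illustrative sketch only: a linear mixed model (statsmodels `MixedLM`) stands in for the generalized model with robust estimates, and the cluster counts, crossover schedule, variable names, and effect sizes are all assumptions, not the NEXPRO data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated closed-cohort stepped-wedge data (illustrative only):
# 3 clusters cross over to the intervention at staggered periods.
n_clusters, n_per_cluster, n_times = 3, 20, 4
rows = []
for c in range(n_clusters):
    crossover = c + 1  # cluster c starts the intervention at period c + 1
    for i in range(n_per_cluster):
        subj = c * n_per_cluster + i
        u = rng.normal(0, 0.05)  # subject-level random intercept
        for t in range(n_times):
            treated = int(t >= crossover)
            # assumed true intervention effect of 0.03 on a 0-1 utility scale,
            # plus a small secular trend (0.01 per period)
            y = 0.7 + 0.01 * t + 0.03 * treated + u + rng.normal(0, 0.1)
            rows.append(dict(subject=subj, cluster=c, time=t,
                             intervention=treated, eqol=y))
df = pd.DataFrame(rows)

# Mixed model: random intercept per subject (repeated measurements),
# fixed effects for cluster, time (secular trend), and intervention.
model = smf.mixedlm("eqol ~ C(cluster) + C(time) + intervention",
                    data=df, groups=df["subject"])
fit = model.fit(reml=True)
print(fit.params["intervention"], fit.bse["intervention"])
```

The categorical time effect is what absorbs a secular trend such as a pandemic period effect; the intervention coefficient is then estimated net of that trend.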
In comparison, our newly proposed IRT model yielded a similarly significant effect of Cohen's d = .31, but with a power loss indicated by a higher SE = .19. Finally, our "crude" classical paired t-test yielded a larger effect size of Cohen's d = .36, SE = .007. For IRT, the average relative parameter bias was 7%, below the ignorable 10-15% threshold (Rodriguez, Reis, & Haviland, 2016). For the paired t-test, the average relative parameter bias was an unacceptable 24%.

DISCUSSION: IRT reduces bias but loses power due to its measurement specification. Thus, if researchers are interested in obtaining accurate (unbiased) intervention-effect estimates and wish to reduce response burden by ignoring follow-up (or pre-advanced) measurements (perhaps because of pandemic externalities, lengthy measurements, or vulnerable populations), then the IRT approach would be appropriate, at the cost of power and potential statistical significance.

CONCLUSION: An IRT alternative to SWD analysis using before-after data yields unbiased effects but loses power. The IRT approach should be replicated in another SWD outside of the COVID-19 "period externality" to understand its potential under "normal" study conditions.
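The bias percentages above are consistent with defining average relative parameter bias as |estimate − reference| / |reference| against the SWD benchmark of d = .29; note that this formula is an assumption, as the abstract does not state one. The sketch below reproduces the 7% and 24% figures and illustrates a paired t-test with a paired-samples Cohen's d on simulated (not study) data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Illustrative paired before-after data (simulated, not the NEXPRO sample)
n = 194
pre = rng.normal(0.70, 0.12, n)         # baseline EQoL-like scores
post = pre + rng.normal(0.04, 0.10, n)  # modest simulated improvement

# Classical paired t-test and paired-samples Cohen's d (mean diff / SD of diffs)
diff = post - pre
t_stat, p_val = stats.ttest_rel(post, pre)
d = diff.mean() / diff.std(ddof=1)

# Average relative parameter bias versus a reference ("gold standard") estimate
def avg_relative_bias(estimates, reference):
    estimates = np.asarray(estimates, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(estimates - reference) / np.abs(reference)))

print(round(avg_relative_bias([0.31], [0.29]), 2))  # IRT vs SWD benchmark: 0.07
print(round(avg_relative_bias([0.36], [0.29]), 2))  # t-test vs SWD benchmark: 0.24
```

Under this definition, the reported 7% (IRT) and 24% (paired t-test) figures both follow directly from the three Cohen's d estimates in the abstract.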