Background

Cluster randomized trials (CRTs) are randomized trials in which randomization takes place at an administrative level (e.g., hospitals, clinics, or schools) rather than at the individual level. When the number of available clusters is small, researchers may not be able to rely on simple randomization to achieve balance on cluster-level covariates across treatment conditions. If these cluster-level covariates are predictive of the outcome, covariate imbalance may distort treatment effect estimates, threaten internal validity, reduce power, and increase the variability of the estimates. Covariate-constrained randomization (CR) is a randomization strategy designed to reduce the risk of imbalance in cluster-level covariates when performing a CRT. Existing methods for CR have been developed and evaluated for two- and multi-arm CRTs but not for factorial CRTs.

Methods

Motivated by the BEGIN study, a CRT for weight loss among patients with pre-diabetes, we develop methods for performing CR in 2 × 2 factorial cluster randomized trials with a continuous outcome and continuous cluster-level covariates. We apply our methods to the BEGIN study and use simulation to assess the performance of CR versus simple randomization for estimating treatment effects by varying the number of clusters, the degree to which clusters are associated with the outcome, the distribution of cluster-level covariates, the size of the constrained randomization space, and analysis strategies.

Results

Compared to simple randomization of clusters, CR in the factorial setting is effective at achieving balance across cluster-level covariates between treatment conditions and provides more precise inferences. When cluster-level covariates are included in the analysis model, CR also results in greater power to detect treatment effects, but power is low compared to unadjusted analyses when the number of clusters is small.

Conclusions

CR should be used instead of simple randomization when performing factorial CRTs to avoid highly imbalanced designs and to obtain more precise inferences. Except when the number of clusters is small, cluster-level covariates should be included in the analysis model to increase power and maintain coverage and type 1 error rates at their nominal levels.
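
As an illustration of the general idea behind CR in a 2 × 2 factorial design, the following is a minimal sketch and not the procedure used in the study. It assumes equal allocation to four arms, a randomly sampled set of candidate allocations, and a simple balance score (the sum of squared standardized arm means). The function names, the balance metric, and the parameters `n_candidates` and `keep_fraction` are all hypothetical choices made for this example.

```python
import numpy as np

def balance_score(X, arms, n_arms=4):
    """Assumed balance metric: sum over arms and covariates of the squared
    arm mean of each standardized covariate (lower means better balance)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # assumes no constant covariate
    return sum(np.sum(Z[arms == a].mean(axis=0) ** 2) for a in range(n_arms))

def constrained_randomization(X, n_candidates=5000, keep_fraction=0.1, seed=0):
    """Sample candidate allocations, keep the best-balanced fraction
    (the constrained randomization space), and draw one at random."""
    rng = np.random.default_rng(seed)
    n_clusters = X.shape[0]
    base = np.arange(n_clusters) % 4  # arm labels 0..3, as equal as possible
    candidates = np.array([rng.permutation(base) for _ in range(n_candidates)])
    scores = np.array([balance_score(X, c) for c in candidates])
    cutoff = np.quantile(scores, keep_fraction)
    constrained_space = candidates[scores <= cutoff]
    chosen = constrained_space[rng.integers(len(constrained_space))]
    return chosen, scores, cutoff

def decode_factors(arms):
    """Map arm labels 0..3 to the two binary factors of the 2 x 2 design."""
    return arms // 2, arms % 2
```

Here `keep_fraction` plays the role of the size of the constrained randomization space: smaller values enforce tighter balance but leave fewer allocations to draw from. Simple randomization corresponds to `keep_fraction=1.0` in this sketch.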