Cluster randomized trials are popular in health-related research due to the need or desire to randomize clusters of subjects to different trial arms as opposed to randomizing each subject individually. As outcomes from subjects within the same cluster tend to be more alike than outcomes from subjects within other clusters, an exchangeable correlation arises that is measured via the intra-cluster correlation coefficient. Intra-cluster correlation coefficient estimation is especially important due to the increasing awareness of the need to publish such values from studies in order to help guide the design of future cluster randomized trials. Therefore, numerous methods have been proposed to accurately estimate the intra-cluster correlation coefficient, with much attention given to binary outcomes. As marginal models are often of interest, we focus on intra-cluster correlation coefficient estimation in the context of fitting such a model with binary outcomes using generalized estimating equations. Traditionally, intra-cluster correlation coefficient estimation with generalized estimating equations has been based on the method of moments, although such estimators can be negatively biased. Furthermore, alternative estimators that work well, such as the analysis of variance estimator, are not as readily applicable in the context of practical data analyses with generalized estimating equations. Therefore, in this article we assess, in terms of bias, the readily available residual pseudo-likelihood approach to intra-cluster correlation coefficient estimation with the GLIMMIX procedure of SAS (SAS Institute, Cary, NC). Furthermore, we study a possible corresponding approach to confidence interval construction for the intra-cluster correlation coefficient. We utilize a simulation study and application example to assess bias in intra-cluster correlation coefficient estimates obtained from GLIMMIX using residual pseudo-likelihood. 
This estimator is contrasted with method of moments and analysis of variance estimators, which serve as standards of comparison. The approach to confidence interval construction is assessed by examining coverage probabilities. Overall, the residual pseudo-likelihood estimator performs very well. It has considerably less bias than moment estimators, which are its main competitors for generalized estimating equation-based analyses in general, and it is therefore a major improvement in practice. Furthermore, it works almost as well as analysis of variance estimators when they are applicable. Confidence intervals have near-nominal coverage when the intra-cluster correlation coefficient estimate has negligible bias. Our results show that the residual pseudo-likelihood estimator is a good option for intra-cluster correlation coefficient estimation when conducting a generalized estimating equation-based analysis of binary outcome data arising from cluster randomized trials. The estimator is practical in that it is simply a result of fitting a marginal model with GLIMMIX, and a confidence interval can be easily obtained. An additional advantage is that, unlike most other options for performing generalized estimating equation-based analyses, GLIMMIX gives analysts the option to use small-sample adjustments that ensure valid inference.
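
For readers unfamiliar with the analysis of variance estimator used here as a standard of comparison, the following is a minimal Python sketch of its common one-way form for binary outcomes. It is an illustration only and is not taken from the article: the function name `anova_icc`, its inputs (per-cluster success counts and cluster sizes), and the choice of this particular textbook form are assumptions. It does not reproduce the residual pseudo-likelihood estimator or any GLIMMIX computation.

```python
from typing import Sequence


def anova_icc(successes: Sequence[int], sizes: Sequence[int]) -> float:
    """One-way ANOVA estimate of the intra-cluster correlation coefficient.

    Hypothetical helper for illustration: `successes[i]` is the number of
    subjects with outcome 1 in cluster i, and `sizes[i]` is that cluster's size.
    """
    k = len(sizes)
    n_total = sum(sizes)
    if k < 2 or len(successes) != k or n_total <= k:
        raise ValueError("need at least two clusters and more subjects than clusters")

    p_bar = sum(successes) / n_total  # overall proportion of successes

    # Between-cluster mean square: size-weighted spread of cluster proportions.
    msb = sum(n * (y / n - p_bar) ** 2 for y, n in zip(successes, sizes)) / (k - 1)

    # Within-cluster mean square: for binary data the within-cluster sum of
    # squares in cluster i is n_i * p_i * (1 - p_i) = y_i * (1 - y_i / n_i).
    msw = sum(y * (1 - y / n) for y, n in zip(successes, sizes)) / (n_total - k)

    # Adjusted average cluster size, which accounts for unequal cluster sizes.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    denom = msb + (n0 - 1) * msw
    if denom == 0:
        raise ValueError("ICC is undefined when all outcomes are identical")
    return (msb - msw) / denom
```

Under these assumptions, a call such as `anova_icc([4, 0, 4, 0], [4, 4, 4, 4])` describes four clusters whose subjects all share the same outcome within each cluster. The estimate can be negative when cluster proportions vary less than would be expected by chance.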