Randomized experiments involving education interventions are typically implemented as cluster-randomized trials, with schools serving as clusters. To design such a study, it is critical to understand how much of the variation in learning outcomes lies between clusters (schools) rather than within them, as summarized by the intraclass correlation coefficient. It is also helpful to anticipate the gains in statistical power from collecting household data, testing students at baseline, or relying on administrative data on previous cohorts from the same school. We use data from multiple cluster-randomized trials in four Latin American countries to provide information on the intraclass correlations in early grade literacy outcomes. We also describe the proportion of variance explained by different types of covariates. These parameters will help future researchers conduct statistical power analyses, estimate the required sample size, and decide whether it is necessary to collect different types of baseline data, such as child assessments, school-level administrative data, or household surveys.
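To illustrate how the intraclass correlation and the variance explained by covariates enter a power calculation, the following is a minimal sketch of the standard minimum detectable effect size (MDES) formula for a two-level cluster-randomized trial. It assumes equal cluster sizes and uses a normal approximation in place of the t distribution, so it slightly understates the MDES when the number of schools is small. The function name, parameter names, and the input values in the usage lines are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import NormalDist


def mdes_cluster_rct(n_clusters, cluster_size, icc,
                     r2_between=0.0, r2_within=0.0,
                     prop_treated=0.5, alpha=0.05, power=0.80):
    """Approximate MDES (in standard deviation units) for a two-level
    cluster-randomized trial with equal cluster sizes.

    icc        : intraclass correlation of the outcome
    r2_between : share of between-cluster variance explained by covariates
    r2_within  : share of within-cluster variance explained by covariates
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    if not 0.0 < prop_treated < 1.0:
        raise ValueError("prop_treated must lie strictly between 0 and 1")

    # Normal approximation to the multiplier for a two-sided test.
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)

    denom = prop_treated * (1 - prop_treated) * n_clusters
    # Between-cluster component: unaffected by the number of students per school.
    between = icc * (1 - r2_between) / denom
    # Within-cluster component: shrinks as more students are sampled per school.
    within = (1 - icc) * (1 - r2_within) / (denom * cluster_size)
    return multiplier * sqrt(between + within)


# Hypothetical inputs, for illustration only.
no_covariates = mdes_cluster_rct(n_clusters=60, cluster_size=20, icc=0.20)
with_baseline = mdes_cluster_rct(n_clusters=60, cluster_size=20, icc=0.20,
                                 r2_between=0.50, r2_within=0.30)
print(f"MDES without covariates: {no_covariates:.3f}")
print(f"MDES with baseline covariates: {with_baseline:.3f}")
```

In this formula a higher intraclass correlation raises the MDES for a fixed number of schools, while covariates that explain between-school variance lower it, which is why the choice of baseline data matters at the design stage.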