Abstract
In randomized experiments with noncompliance, one might wish to focus on compliers rather than on the overall sample. In this vein, Rubin (1998) argued that testing for the complier average causal effect and averaging permutation-based p-values over the posterior distribution of the compliance types could increase power as compared to general intent-to-treat tests. The general scheme is a repeated two-step process: impute missing compliance types and conduct a permutation test with the completed data. In this paper, we explore this idea further, comparing the use of discrepancy measures, which depend on unknown but imputed parameters, to classical test statistics and contrasting different approaches for imputing the unknown compliance types. We also examine the consequences of model misspecification in the imputation step, and discuss to what extent this additional modeling undercuts the advantage of permutation tests being model independent. We find that, especially for discrepancy measures, modeling choices can impact both power and validity. In particular, imputing missing compliance types under the null can radically reduce power, but not doing so can jeopardize validity. Fortunately, using covariates predictive of compliance type in the imputation can mitigate both problems. We also compare this overall approach to Bayesian model-based tests, that is, tests that are directly derived from posterior credible intervals, under both correct and incorrect model specification.
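The repeated two-step scheme described in the abstract (impute compliance types, then run a permutation test on the completed data, then average the p-values) can be sketched roughly as follows. This is only an illustrative sketch under strong simplifying assumptions that are not taken from the paper: one-sided noncompliance (control units cannot receive treatment, so compliance type is missing only in the control arm), a Beta(1, 1) prior on the complier share, and an imputation step that ignores outcomes and covariates. The function names `complier_diff` and `frt_pp_pvalue` and all parameters are hypothetical.

```python
import numpy as np


def complier_diff(y, z, g):
    """Absolute difference in mean outcomes between arms among compliers (g == 1)."""
    treated = (z == 1) & (g == 1)
    control = (z == 0) & (g == 1)
    if not treated.any() or not control.any():
        return 0.0  # statistic is undefined when an arm has no compliers; use 0 here
    return abs(y[treated].mean() - y[control].mean())


def frt_pp_pvalue(y, z, d, n_imputations=50, n_permutations=200, seed=0):
    """Average permutation p-value over imputed compliance types.

    y: outcomes, z: assignment (0/1), d: treatment received (0/1).
    Assumes one-sided noncompliance, so d reveals the type when z == 1.
    """
    rng = np.random.default_rng(seed)
    in_treatment = z == 1
    n_compliers = int(d[in_treatment].sum())
    n_never = int(in_treatment.sum()) - n_compliers
    pvals = []
    for _ in range(n_imputations):
        # Step 1: draw the complier share from its Beta posterior (assumed
        # Beta(1, 1) prior) and impute the missing types in the control arm.
        pi_c = rng.beta(1 + n_compliers, 1 + n_never)
        g = d.astype(int)
        g[~in_treatment] = rng.binomial(1, pi_c, size=int((~in_treatment).sum()))
        # Step 2: permutation test on the completed data. Under the sharp
        # null, y and g stay fixed while the assignment vector is reshuffled.
        t_obs = complier_diff(y, z, g)
        t_perm = np.array(
            [complier_diff(y, rng.permutation(z), g) for _ in range(n_permutations)]
        )
        pvals.append(np.mean(t_perm >= t_obs))
    return float(np.mean(pvals))
```

Calling `frt_pp_pvalue(y, z, d)` on arrays of equal length returns a single averaged p-value. The imputation step is deliberately naive; swapping in a model that uses outcomes or covariates, or that imputes under the null, would change only Step 1.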
Highlights
In randomized experiments, noncompliance arises when the actual treatment received does not correspond to the assigned treatment.
We explore the general idea of posterior predictive Fisher randomization tests (FRT-PPs) in more depth and conduct extensive simulation studies to show how these tests play out in practice in randomized experiments with noncompliance.
Posterior predictive p-values based on the test statistic are largely unaffected by the choice of imputation method.
Summary
Noncompliance arises when the actual treatment received does not correspond to the assigned treatment. An alternative to intent-to-treat (ITT) analyses is to focus on the effect of the treatment on compliers, i.e., those who would take the treatment if offered and would not take it otherwise (Imbens & Angrist, 1994; Angrist et al., 1996). Identification of the complier average causal effect (CACE) relies on certain assumptions (monotonicity and the exclusion restriction) under which a zero average ITT effect is a necessary and sufficient condition for the CACE to be zero. Under these assumptions, a valid test for the average ITT effect is also a valid test for the CACE.
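The link between the ITT effect and the CACE can be written out as a short decomposition over compliance types. The notation here is an assumption introduced for illustration: $\pi_g$ is the population share of type $g$ (compliers $c$, never-takers $n$, always-takers $a$), and the share of compliers is taken to be strictly positive.

```latex
\begin{align*}
\mathrm{ITT}_Y
  &= \sum_{g \in \{c,\, n,\, a\}} \pi_g \, \mathrm{ITT}_{Y,g}
     && \text{(monotonicity: no defiers)} \\
  &= \pi_c \, \mathrm{CACE} + \pi_n \cdot 0 + \pi_a \cdot 0
     && \text{(exclusion restriction)} \\
  &= \pi_c \, \mathrm{CACE},
\qquad\text{so}\qquad
\mathrm{ITT}_Y = 0 \iff \mathrm{CACE} = 0 \quad \text{when } \pi_c > 0 .
\end{align*}
```

Monotonicity removes the defier term from the sum, and the exclusion restriction sets the assignment effect to zero for units whose treatment received does not change with assignment.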