We develop a nonparametric Bayesian analysis of regression discontinuity (RD) designs, allowing for covariates, in which we model and estimate the unknown functions of the forcing variable by basis expansion methods. In a departure from current methods, we use the entire data on the forcing variable, but we emphasize the data near the threshold by placing some knots at and near the threshold, a technique we refer to as soft-windowing. To handle the nonequally spaced knots that emerge from soft-windowing, we construct a prior on the spline coefficients, from a second-order Ornstein–Uhlenbeck process, which is hyperparameter light, and satisfies the Kullback–Leibler support property. In the fuzzy RD design, we explain the divergence between the treatment implied by the forcing variable, and the actual intake, by a discrete confounder variable, taking three values, complier, never-taker, and always-taker, and a model with four potential outcomes. Choice of the soft-window, and the number of knots, is determined by marginal likelihoods, computed by the method of Chib [Journal of the American Statistical Association, 1995, 90, 1313–1321] as a by-product of the Markov chain Monte Carlo (MCMC)-based estimation. Importantly, in each case, we allow for covariates, incorporated nonparametrically by additive natural cubic splines. The potential outcome error distributions are modeled as student-t, with an extension to Dirichlet process mixtures. We derive the large sample posterior consistency, and posterior contraction rate, of the RD average treatment effect (ATE) (in the sharp case) and RD ATE for compliers (in the fuzzy case), as the number of basis parameters increases with sample size. The excellent performance of the methods is documented in simulation experiments, and in an application to educational attainment of women from Meyersson [Econometrica, 2014, 82, 229–269].
Read full abstract