Abstract

We consider an experiment with at least two stages or batches and $O(N)$ subjects per batch. First, we propose a semiparametric treatment effect estimator that efficiently pools information across the batches, and we show that it asymptotically dominates alternatives that aggregate single-batch estimates. Then, we consider the design problem of learning propensity scores for assigning treatment in the later batches of the experiment to maximize the asymptotic precision of this estimator. For two common causal estimands, we estimate this precision using observations from previous batches, and then solve a finite-dimensional concave maximization problem to adaptively learn flexible propensity scores that converge to suitably defined optima in each batch at rate $O_p(N^{-1/4})$. By extending the framework of double machine learning, we show this rate suffices for our pooled estimator to attain the targeted precision after each batch, as long as nuisance function estimates converge at rate $o_p(N^{-1/4})$. These relatively weak rate requirements enable the investigator to avoid the common practice of discretizing the covariate space for design and estimation in batch adaptive experiments while maintaining the advantages of pooling. Our numerical study shows that such discretization often leads to substantial asymptotic and finite-sample precision losses that outweigh any gains from design.
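To make the pooling idea concrete, the following is a minimal sketch of one standard form such a pooled estimator could take for the average treatment effect, assuming an augmented inverse propensity weighting (AIPW) construction with batch-specific propensity scores in the style of double machine learning; the abstract does not specify the estimator, so the notation ($\mathcal{B}_t$, $\hat{\mu}_w$, $\hat{e}_t$) is illustrative rather than taken from the paper.

```latex
% Hypothetical sketch: a pooled AIPW estimator of the ATE across T batches,
% where batch t has subjects \mathcal{B}_t and a batch-specific propensity
% score \hat{e}_t learned by the design step. The outcome models \hat{\mu}_0,
% \hat{\mu}_1 are the nuisance functions whose o_p(N^{-1/4}) convergence the
% abstract requires. All notation here is illustrative, not from the paper.
\[
\hat{\tau} \;=\; \frac{1}{N} \sum_{t=1}^{T} \sum_{i \in \mathcal{B}_t}
\left[
  \hat{\mu}_1(X_i) - \hat{\mu}_0(X_i)
  \;+\; \frac{W_i \bigl(Y_i - \hat{\mu}_1(X_i)\bigr)}{\hat{e}_t(X_i)}
  \;-\; \frac{(1 - W_i) \bigl(Y_i - \hat{\mu}_0(X_i)\bigr)}{1 - \hat{e}_t(X_i)}
\right]
\]
```

Under this reading, pooling enters through the single sum over all batches with batch-specific propensities, rather than averaging separately computed per-batch estimates, which is what the abstract contrasts against.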