Abstract

The average treatment effect (ATE), a causal parameter, is estimated in two steps: in the first step, the treatment and outcome are modeled to account for potential confounders; in the second step, the resulting predictions are plugged into an ATE estimator such as the augmented inverse probability weighting (AIPW) estimator. Because the relationships between the confounders and the treatment and outcome may be non-linear or unknown, there has been interest in using non-parametric methods such as machine learning (ML) algorithms for the first step. Some of the literature proposes two separate neural networks (NNs) with no regularization on the network parameters other than that induced by stochastic gradient descent (SGD) during optimization. Our simulations indicate that the AIPW estimator performs poorly if no regularization is used. We propose a normalized version of AIPW (referred to as nAIPW), which can be helpful in some scenarios. nAIPW provably retains the properties of AIPW, namely double robustness and orthogonality. Further, if the first-step algorithms converge fast enough, then under regularity conditions nAIPW is asymptotically normal. We also compare the bias and variance of AIPW and nAIPW when small to moderate regularization is imposed on the NNs.

Highlights

  • Estimation of causal parameters such as the average treatment effect (ATE) in observational data requires confounder adjustment

  • We demonstrate that in the presence of strong confounders and instrumental variables (IVs), if complex neural networks without L1 regularization are used in the first-step estimation, both the augmented inverse probability weighting (AIPW) and normalized augmented inverse probability weighting (nAIPW) estimators and their asymptotic variances perform poorly, although nAIPW performs relatively better

  • The augmented inverse probability weighting (AIPW) estimator [21] is an improvement over the single robust (SR), IPW and nIPW estimators; it involves predictions for both the treatment and the outcome, and the causal parameter can be expressed as: β=E


Summary

Introduction

Estimation of causal parameters such as the average treatment effect (ATE) in observational data requires confounder adjustment. Farrell et al. [1] proposed using two separate neural networks (double NNs, or dNNs) with no regularization on the network parameters other than that induced by stochastic gradient descent (SGD) during optimization [2,3,4,5]. They derive generalization bounds and prove that the NN algorithms converge fast enough for causal estimators such as the augmented inverse probability weighting (AIPW) estimator [6,7,8] to be asymptotically linear, under regularity conditions and with cross-fitting [9]. IPW is proven to be a consistent estimator of the ATE if the propensity scores (the conditional probabilities of treatment assignment) are estimated by a consistent parametric or non-parametric model. The proofs are straightforward but long and are included in Appendix A.
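To make the second step concrete, the following is a minimal sketch of how first-step predictions might be plugged into AIPW and into a weight-normalized variant. It is an illustration under stated assumptions, not the paper's implementation: the function names `aipw` and `naipw`, the argument names, and the simulated data are hypothetical, and the normalization shown (rescaling the inverse-probability weights in each treatment arm so they sum to one) is an assumed reading of "normalized AIPW" that should be checked against the paper's definition. Oracle nuisance values stand in for the neural network predictions.

```python
import numpy as np


def aipw(y, a, g, q1, q0):
    """AIPW estimate of the ATE from first-step predictions.

    y: outcomes, a: binary treatment, g: predicted propensity scores,
    q1/q0: predicted outcomes under treatment / control.
    """
    y, a, g, q1, q0 = (np.asarray(v, dtype=float) for v in (y, a, g, q1, q0))
    correction = a * (y - q1) / g - (1 - a) * (y - q0) / (1 - g)
    return float(np.mean(q1 - q0 + correction))


def naipw(y, a, g, q1, q0):
    """Weight-normalized AIPW (assumed form, see lead-in).

    The inverse-probability weights in each arm are rescaled to sum to one,
    so a few extreme propensity scores cannot inflate the correction term.
    Requires at least one treated and one control unit.
    """
    y, a, g, q1, q0 = (np.asarray(v, dtype=float) for v in (y, a, g, q1, q0))
    w1 = a / g
    w0 = (1 - a) / (1 - g)
    w1 = w1 / w1.sum()
    w0 = w0 / w0.sum()
    correction = np.sum(w1 * (y - q1)) - np.sum(w0 * (y - q0))
    return float(np.mean(q1 - q0) + correction)


# Hypothetical simulated data with a single confounder x and a true ATE of 2.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
g_true = 1.0 / (1.0 + np.exp(-x))        # true propensity score
a = rng.binomial(1, g_true)              # treatment assignment
y = 2.0 * a + x + rng.normal(size=n)     # outcome

# Oracle first-step predictions, used here in place of fitted NNs.
q1_true = 2.0 + x
q0_true = x

aipw_est = aipw(y, a, g_true, q1_true, q0_true)
naipw_est = naipw(y, a, g_true, q1_true, q0_true)
print(f"AIPW: {aipw_est:.3f}  nAIPW: {naipw_est:.3f}")
```

With well-behaved propensity scores the two estimates should be close to each other; they would be expected to diverge mainly when some predicted propensities approach 0 or 1, which is the setting the normalization targets.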

Normalized Doubly Robust Estimator
Outcome and Treatment Predictions
GDR Estimator Properties
Consistency and Asymptotic Distribution of nAIPW
Doubly Robustness and Rate Doubly Robustness Properties of GDR
Robustness of nAIPW against Extreme Propensity scores
Scenario Analysis
Asymptotic Sampling Distribution of nAIPW
Asymptotic Variance of nAIPW
Monte Carlo Experiments
Simulation Results
Application
Discussion
