Probability surveys are a major source of population-representative data for policy research and program evaluation. However, the data come with the added complications of being observational and selected with unequal probabilities. Propensity score adjustments have become increasingly popular for inferring causal relationships in non-randomized studies, but when survey data are used, estimates of the population-level causal effect may be biased if the analysis does not adequately account for the sampling design. The current practice of using propensity score estimators with complex surveys is somewhat ad hoc. We propose a potential-outcome super-population framework to streamline the causal analysis. We also develop propensity-score-and-survey-weighted estimators with corresponding variance estimators, and establish their asymptotic properties. Our framework resolves the confusion over whether to use survey-weighted propensity scores in practice: the choice depends on which sampling weights are available. Various estimators are compared in a simulation study, which shows that the proposed estimators outperform the competing methods in terms of bias and confidence interval coverage when treatment effects are heterogeneous. To address an important public health issue, we evaluate the impact of e-cigarette use on teens' future intention to use tobacco, using a large nationally representative survey in the United States.