Weight Trimming and Propensity Score Weighting

Brian K Lee, Elizabeth A Stuart, Justin Lessler, Giuseppe Biondi-Zoccai

doi:10.1371/journal.pone.0018174

Open Access

Journal: PLoS ONE
Publication Date: Mar 31, 2011
Citations: 394
License type: CC BY 4.0

Affiliation: Drexel University, Johns Hopkins University

Abstract

Propensity score weighting is sensitive to model misspecification and outlying weights that can unduly influence results. The authors investigated whether trimming large weights downward can improve the performance of propensity score weighting and whether the benefits of trimming differ by propensity score estimation method. In a simulation study, the authors examined the performance of weight trimming following logistic regression, classification and regression trees (CART), boosted CART, and random forests to estimate propensity score weights. Results indicate that although misspecified logistic regression propensity score models yield increased bias and standard errors, weight trimming following logistic regression can improve the accuracy and precision of final parameter estimates. In contrast, weight trimming did not improve the performance of boosted CART and random forests. The performance of boosted CART and random forests without weight trimming was similar to the best performance obtainable by weight trimmed logistic regression estimated propensity scores. While trimming may be used to optimize propensity score weights estimated using logistic regression, the optimal level of trimming is difficult to determine. These results indicate that although trimming can improve inferences in some settings, in order to consistently improve the performance of propensity score weighting, analysts should focus on the procedures leading to the generation of weights (i.e., proper specification of the propensity score model) rather than relying on ad-hoc methods such as weight trimming.
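To make the idea of trimming large weights downward concrete, here is a minimal sketch in Python. It rests on assumptions that the abstract does not state: the weights are inverse-probability-of-treatment weights, and trimming means capping every weight above a chosen percentile at that percentile's value. The function names (`ate_weights`, `trim_weights`, `weighted_mean_difference`) and the toy numbers are hypothetical and serve only as illustration; they are not the authors' implementation.

```python
import numpy as np


def ate_weights(treated, propensity):
    """Inverse-probability weights: 1/e for treated units, 1/(1-e) for controls.

    Assumption: this weight form is chosen for illustration only.
    """
    treated = np.asarray(treated, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    return treated / propensity + (1.0 - treated) / (1.0 - propensity)


def trim_weights(weights, upper_percentile=99.0):
    """Cap every weight above the given percentile at that percentile's value."""
    weights = np.asarray(weights, dtype=float)
    cap = np.percentile(weights, upper_percentile)
    return np.minimum(weights, cap)


def weighted_mean_difference(outcome, treated, weights):
    """Difference between the weighted outcome means of treated and control units."""
    outcome = np.asarray(outcome, dtype=float)
    is_treated = np.asarray(treated).astype(bool)
    weights = np.asarray(weights, dtype=float)
    mean_treated = np.average(outcome[is_treated], weights=weights[is_treated])
    mean_control = np.average(outcome[~is_treated], weights=weights[~is_treated])
    return mean_treated - mean_control


# Toy data (hypothetical): the second unit is treated despite a propensity
# score of 0.02, so its raw weight is 1 / 0.02 = 50 and dominates the others.
treated = np.array([1, 1, 0, 0, 0])
propensity = np.array([0.5, 0.02, 0.5, 0.4, 0.2])
outcome = np.array([3.0, 9.0, 2.0, 1.0, 2.5])

raw = ate_weights(treated, propensity)
trimmed = trim_weights(raw, upper_percentile=80.0)

effect_raw = weighted_mean_difference(outcome, treated, raw)
effect_trimmed = weighted_mean_difference(outcome, treated, trimmed)
```

In this toy example only the single extreme weight is pulled down; all other weights are unchanged. The estimate still shifts, which illustrates why the choice of trimming level matters and why, as the abstract notes, the optimal level is difficult to determine.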

Highlights

Propensity score methods are a means of controlling for confounding in non-experimental studies [1].
In a previous study of propensity score estimation using classification and regression tree (CART) methods, we found that certain machine learning data fitting methods could provide substantially better bias reduction and confidence interval coverage compared with logistic regression [24].
We evaluate the performance of weight trimming by examining the bias, 95% confidence interval (CI) coverage, and standard error of effect estimates.

Summary

Introduction

Propensity score methods are a means of controlling for confounding in non-experimental studies [1]. The propensity score is the probability of receiving a treatment conditional on observed covariates. By conditioning on the propensity score, one can achieve an unbiased estimate of the treatment effect, assuming no unmeasured confounding. Conditioning on the propensity score typically occurs through weighting, matching, stratification, or regression adjustment. Although any of these methods can be used for propensity score adjustment, some evidence suggests that weighting and matching may be optimal in some instances [2]. In studies involving complex sampling methods where units have differential probabilities of inclusion, propensity score weighting may be recommended [3].
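The definition above can be written compactly. The notation here is an assumption introduced for illustration (T for the binary treatment indicator, X for the observed covariates, e(X) for the propensity score), and the weight shown is one common inverse-probability form; this summary does not state which weighting scheme the authors used.

```latex
% Propensity score: probability of treatment given observed covariates
e(X) = \Pr(T = 1 \mid X)

% One common weighting scheme (assumed here): inverse probability of the
% treatment actually received
w_i = \frac{T_i}{e(X_i)} + \frac{1 - T_i}{1 - e(X_i)}
```

Under this form, a treated unit with a very small e(X_i), or a control unit with e(X_i) close to 1, receives a very large weight, which is the kind of outlying weight that trimming is meant to address.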

Methods

Results

Conclusion