Abstract
We extend propensity score methodology to incorporate survey weights from complex survey data and compare the use of multiple linear regression and propensity score analysis to estimate treatment effects in observational data from a complex survey. For illustration, we use these two methods to estimate the effect of gender on information technology (IT) salaries. In our analysis, both methods agree on the size and statistical significance of the overall gender salary gaps in the United States in four different IT occupations after controlling for educational and job-related covariates. Each method, however, has its own advantages, which we discuss. We also show that it is important to incorporate the survey design in both linear regression and propensity score analysis. Ignoring the survey weights affects the estimates of population-level effects substantially in our analysis.
Highlights
We compare the use of multiple linear regression and propensity score analysis to estimate treatment effects in observational data arising from a complex survey
Although multiple linear regression is a commonly used technique for estimating treatment effects in observational data, the statistical literature suggests that propensity score analysis has several advantages over multiple linear regression (Hill, Reiter, and Zanutto, 2004; Perkins, Tu, Underhill, Zhou, and Murray, 2000; Rubin, 1997) and is becoming more prevalent, for example, in public policy and epidemiologic research (e.g., D’Agostino, 1998; Dehejia and Wahba, 1999; Hornik et al., 2002; Perkins et al., 2000; Rosenbaum, 1986; Rubin, 1997)
If there is little or no overlap in the propensity score distributions, this is an indication that the men and women in the sample are very different and comparisons between these groups should be made with extreme caution or not at all
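The overlap check described in the last highlight can be illustrated with a minimal sketch. It assumes propensity scores have already been estimated for each group; the function name `overlap_summary` and the example scores are hypothetical and are not taken from the paper. The range comparison used here is a crude diagnostic, and comparing histograms of the two score distributions would usually be more informative.

```python
def overlap_summary(ps_group_a, ps_group_b):
    """Summarize the common support of two groups' estimated propensity scores.

    Returns the overlapping score range (or None if the ranges are disjoint)
    and the fraction of each group whose score falls inside that range.
    """
    lo = max(min(ps_group_a), min(ps_group_b))
    hi = min(max(ps_group_a), max(ps_group_b))
    if lo > hi:
        # Disjoint ranges: the groups are too different to compare safely.
        return {"common_support": None, "frac_a_inside": 0.0, "frac_b_inside": 0.0}

    def frac_inside(scores):
        return sum(lo <= s <= hi for s in scores) / len(scores)

    return {
        "common_support": (lo, hi),
        "frac_a_inside": frac_inside(ps_group_a),
        "frac_b_inside": frac_inside(ps_group_b),
    }


# Hypothetical scores, for illustration only.
ps_men = [0.2, 0.4, 0.6, 0.8]
ps_women = [0.1, 0.3, 0.5, 0.7]
summary = overlap_summary(ps_men, ps_women)
```

A small fraction inside the common range for either group would suggest that comparisons rest on only a few comparable units.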
Summary
We compare the use of multiple linear regression and propensity score analysis to estimate treatment effects in observational data arising from a complex survey. Propensity score analysis techniques use observational data to create groups of treated and control units that have similar covariate values so that subsequent comparisons, made within these matched groups, are not confounded by differences in covariate distributions.
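One way to make the idea of within-group comparisons concrete is subclassification on the estimated propensity score, with survey weights used inside each subclass. The sketch below is an assumption-laden illustration, not necessarily the estimator used in the paper: the function names, the record keys (`ps`, `treated`, `y`, `w`), the equal-count strata, and the choice to combine strata by their total survey weight are all hypothetical.

```python
def weighted_mean(values, weights):
    """Survey-weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def subclass_effect(units, n_strata=5):
    """Estimate a treatment effect by propensity score subclassification.

    units: list of dicts with keys 'ps' (estimated propensity score),
    'treated' (bool), 'y' (outcome), and 'w' (survey weight).
    Strata are equal-count groups of units sorted by propensity score.
    """
    ordered = sorted(units, key=lambda u: u["ps"])
    n = len(ordered)
    bounds = [k * n // n_strata for k in range(n_strata + 1)]

    weighted_sum = 0.0
    total_weight = 0.0
    for start, stop in zip(bounds, bounds[1:]):
        stratum = ordered[start:stop]
        treated = [u for u in stratum if u["treated"]]
        control = [u for u in stratum if not u["treated"]]
        if not treated or not control:
            continue  # no within-stratum comparison is possible
        diff = weighted_mean(
            [u["y"] for u in treated], [u["w"] for u in treated]
        ) - weighted_mean([u["y"] for u in control], [u["w"] for u in control])
        # Assumption: each stratum counts in proportion to its total survey weight.
        stratum_weight = sum(u["w"] for u in stratum)
        weighted_sum += stratum_weight * diff
        total_weight += stratum_weight

    if total_weight == 0:
        raise ValueError("no stratum contains both treated and control units")
    return weighted_sum / total_weight


# Hypothetical records, for illustration only.
example_units = [
    {"ps": 0.2, "treated": True, "y": 10.0, "w": 1.0},
    {"ps": 0.3, "treated": False, "y": 8.0, "w": 3.0},
    {"ps": 0.6, "treated": True, "y": 20.0, "w": 2.0},
    {"ps": 0.7, "treated": False, "y": 15.0, "w": 4.0},
]
estimate = subclass_effect(example_units, n_strata=2)
```

Setting every `w` to 1 gives the unweighted version of the same estimate, which makes it easy to see how much the survey weights change the result.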