Abstract

Whether and when to use sampling weights in analyses of complex survey data has been the subject of substantial debate in the statistical and epidemiologic literature; an excellent summary of the issues is given by Korn and Graubard (1999). The exchange in this journal (Kim, 2012; Kim and Kim, 2012; Lee and Kim, 2012) presents two different opinions on the issue. Here, we outline some of the challenges and controversies, and provide some guidance for future work.

Complex surveys may sample certain groups or strata at higher rates than others, both to ensure sufficient precision for group-specific estimates and to reduce bias. Observations are then assigned weights to account for this over-sampling. Additional weighting may also be applied to account for missing data (unit non-response); observations on subjects without missing data are up-weighted in order to represent comparable subjects with missing observations (Brick and Kalton, 1996). An important distinction between the two kinds of weights is that the sampling weights are usually determined by the design, planned in advance, and considered fixed and known, while the missing-data weights are based on a model of the missingness mechanism, i.e., the process that gives rise to missing data.

In a complex survey the weights are designed to ensure that the sample is representative of the whole population when the weights are taken into account. Weighted estimates of population-level parameters (e.g., the mean of a continuous outcome variable) are then unbiased, meaning that if the study were repeated infinitely many times, the average parameter estimate would equal the true population parameter. There is relatively little debate on the need to use weighted analysis for this purpose (Korn and Graubard, 1999); it is essential when the mean outcome differs between sampling groups, or between the complete cases and the full data.
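The effect of over-sampling on a population-level mean can be illustrated with a small simulation. The sketch below is not taken from any of the cited papers; the two strata, their sizes, and their outcome distributions are hypothetical values chosen only for illustration. It draws equal-sized samples from a large and a small stratum (so the small stratum is over-sampled), assigns each observation a weight equal to the inverse of its sampling fraction, and compares the unweighted and weighted sample means with the true population mean.

```python
import random

def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

random.seed(0)

# Hypothetical finite population with two strata whose mean outcomes differ.
pop_a = [random.gauss(50, 5) for _ in range(9000)]  # large stratum
pop_b = [random.gauss(80, 5) for _ in range(1000)]  # small stratum
true_mean = (sum(pop_a) + sum(pop_b)) / (len(pop_a) + len(pop_b))

# Sample the same number from each stratum, so stratum B is over-sampled.
n = 200
samp_a = random.sample(pop_a, n)
samp_b = random.sample(pop_b, n)
values = samp_a + samp_b

# Sampling weight = inverse of the sampling fraction within each stratum.
weights = [len(pop_a) / n] * n + [len(pop_b) / n] * n

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)

print(f"true mean:       {true_mean:.2f}")
print(f"unweighted mean: {unweighted:.2f}")
print(f"weighted mean:   {weighted:.2f}")
```

Under these assumptions the unweighted mean is pulled toward the over-sampled stratum, while the weighted mean stays close to the population value. If the two strata had the same mean outcome, both estimates would target the same quantity and the weights would not be needed to remove bias.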
Weighting may be statistically inefficient when the mean outcome does not differ between sampling groups (meaning that standard errors are large
