Abstract
This paper examines existing methods of evaluating sample quality, showing that their practical utility and applicability to large-scale cross-project comparisons depend on whether they require auxiliary individual-level data. Among the methods that do not require such additional data, we differentiate between two approaches that rely on (i) external criteria, that is, comparisons of sample estimates to benchmarks derived from external population statistics, and (ii) internal criteria, that is, comparisons of subsample estimates to a theoretically derived a priori value. Our analyses demonstrate the advantages and limitations of both approaches based on an evaluation of 1,125 national surveys carried out in Europe between 2002 and 2016 within four survey projects: the Eurobarometer, European Quality of Life Survey, European Social Survey, and International Social Survey Programme. We show that the prevailing absence of design weights in cross-national survey datasets severely limits the applicability of external criteria evaluations. In contrast, using internal criteria without any weights proves acceptable, because incorporating design weights in calculations of internal sample quality has only minor consequences for estimates of sample bias. Furthermore, applying internal criteria, we find that around 75 percent of samples in the four analyzed projects are not significantly biased. We also identify surveys with extremely high sample bias and investigate its potential sources. The paper concludes with recommendations for future research, directed at secondary data users as well as producers of cross-national surveys.
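As an illustration of what an internal-criteria check might look like, the sketch below compares an unweighted subsample proportion to an a priori value using a standardized difference. It is a minimal sketch under stated assumptions, not the paper's own procedure: the function name `internal_bias`, the default a priori value of 0.5, the normal approximation, and the 1.96 cutoff are all choices made here for illustration.

```python
from math import sqrt


def internal_bias(successes: int, n: int, p0: float = 0.5, z_crit: float = 1.96):
    """Compare a subsample proportion to an a priori value p0.

    Hypothetical helper: p0 = 0.5 and z_crit = 1.96 are illustrative
    assumptions, not values taken from the abstract.
    Returns (observed proportion, standardized difference, significant?).
    """
    if n <= 0:
        raise ValueError("subsample size n must be positive")
    if not 0 < p0 < 1:
        raise ValueError("p0 must lie strictly between 0 and 1")

    p_hat = successes / n
    # Standard error of a proportion under the a priori value p0
    # (simple random sampling assumed, no design weights).
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    return p_hat, z, abs(z) > z_crit


# Example: 260 of 500 subsample members fall in the category of interest.
p_hat, z, biased = internal_bias(260, 500)
print(f"p_hat={p_hat:.3f}, z={z:.3f}, significantly biased={biased}")
```

Because the benchmark is fixed by theory rather than taken from external population statistics, a check of this kind needs only the survey's own data, which is what makes it usable across many surveys that lack design weights.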