Abstract
Survey data collection costs have risen to a point where many survey researchers and polling companies are abandoning large, expensive probability-based samples in favor of less expensive nonprobability samples. The empirical literature suggests this strategy may be suboptimal for multiple reasons, among them that probability samples tend to outperform nonprobability samples on accuracy when assessed against population benchmarks. However, nonprobability samples are often preferred for their convenience and lower cost. Instead of forgoing probability sampling entirely, we propose a method of combining probability and nonprobability samples within a Bayesian inferential framework, in a way that exploits the strengths of each to overcome the weaknesses of the other. Using simulated data, we evaluate supplementing inferences based on small probability samples with prior distributions derived from nonprobability data. We demonstrate that informative priors based on nonprobability data can reduce the variances and mean squared errors of linear model coefficient estimates. The method is also illustrated with actual probability and nonprobability survey data. We conclude with a discussion of these findings, their implications for survey practice, and possible research extensions.
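The core idea in the abstract can be illustrated with a short, self-contained sketch: fit a linear model to a large (possibly biased) nonprobability sample, treat the resulting estimates as an informative prior, and update that prior with a small probability sample. This is not the authors' code; the simulation setup, the conjugate normal update with known error variance, and the inflation factor `k0` are all illustrative assumptions.

```python
# A minimal sketch, assuming a conjugate normal prior for the regression
# coefficients. All names (beta_np, k0, etc.) are illustrative.
import numpy as np

rng = np.random.default_rng(0)
beta_true = np.array([1.0, 2.0])

def simulate(n, bias=0.0):
    """Simulate (X, y); `bias` shifts the slope to mimic selection bias."""
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ (beta_true + np.array([0.0, bias])) + rng.normal(size=n)
    return X, y

X_np, y_np = simulate(2000, bias=0.3)   # large, possibly biased nonprobability sample
X_p, y_p = simulate(100)                # small probability sample

# Prior from the nonprobability sample: the MLE and its covariance,
# inflated by a scaling factor k0 to hedge against selection bias.
beta_np, *_ = np.linalg.lstsq(X_np, y_np, rcond=None)
resid = y_np - X_np @ beta_np
sigma2_np = resid @ resid / (len(y_np) - X_np.shape[1])
V_np = sigma2_np * np.linalg.inv(X_np.T @ X_np)
k0 = 10.0                                # larger k0 = weaker prior (assumed value)
prior_cov = k0 * V_np

# Conjugate update with the probability sample (error variance treated
# as known for simplicity; the paper works in a full Bayesian framework).
sigma2_p = 1.0
post_prec = np.linalg.inv(prior_cov) + X_p.T @ X_p / sigma2_p
post_cov = np.linalg.inv(post_prec)
post_mean = post_cov @ (np.linalg.inv(prior_cov) @ beta_np
                        + X_p.T @ y_p / sigma2_p)
print("posterior mean:", post_mean)
print("posterior sd:  ", np.sqrt(np.diag(post_cov)))
```

Because the posterior precision is the sum of the prior precision and the data precision, the posterior standard deviations are smaller than those from the probability sample alone, which is the variance reduction the abstract describes; the price is potential bias if the nonprobability prior is badly miscentered.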
Highlights
For more than a decade, the survey research industry has witnessed increasing competition between two distinct sampling paradigms: probability and nonprobability sampling.
To construct the prior from nonprobability data, we propose scaling factors V and k0 that depend on the potential bias in the maximum likelihood estimator (MLE) of the coefficients based on the nonprobability sample, assessed against the MLE based on the probability sample (see the sketch following these highlights).
We evaluate a method of integrating relatively small probability samples with nonprobability samples to improve the efficiency and reduce the mean squared error for estimated regression coefficients
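The highlights do not give the exact form of V and k0, so the following is only one plausible construction, assuming the scaling factor grows with the disagreement between the two MLEs, measured on the scale of the probability sample's uncertainty. The function name and the Mahalanobis-distance rule are assumptions for illustration, not the paper's formula.

```python
# A hypothetical bias-dependent scaling factor for the prior covariance:
# the more the nonprobability MLE disagrees with the probability-sample
# MLE, the more the nonprobability prior is down-weighted.
import numpy as np

def bias_scaled_k0(beta_np, beta_p, cov_p, floor=1.0):
    """Return a factor >= `floor` equal to the squared Mahalanobis
    distance between the two MLEs under the probability-sample
    covariance `cov_p`."""
    d = np.asarray(beta_np) - np.asarray(beta_p)
    return max(floor, float(d @ np.linalg.solve(cov_p, d)))

# Illustrative values: close MLEs leave the prior essentially unscaled,
# discrepant MLEs inflate the prior covariance heavily.
cov_p = np.diag([0.04, 0.04])
print(bias_scaled_k0([1.0, 2.3], [1.0, 2.2], cov_p))  # 0.25 -> floored at 1.0
print(bias_scaled_k0([1.0, 3.0], [1.0, 2.0], cov_p))  # 25.0
```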
Summary
For more than a decade, the survey research industry has witnessed increasing competition between two distinct sampling paradigms: probability and nonprobability sampling. Nonprobability sampling involves some form of arbitrary selection of elements into the sample, for which inclusion probabilities are unknowable (and possibly zero for some population elements). Probability sampling, by contrast, assigns every population element a known, nonzero inclusion probability, which in principle permits unbiased estimation. In practice, however, unbiased estimation is not assured, as response rates in probability surveys can be quite low. Another challenge of probability sampling is the need for large sample sizes for robust estimation, which can be problematic for survey organizations working with small- to medium-sized budgets.