Abstract

Respondent-Driven Sampling (RDS) is an approach to sampling design and inference in hard-to-reach human populations. It is often used in situations where the target population is rare and/or stigmatized in the larger population, so that it is prohibitively expensive to contact its members through the available frames. Common examples include injecting drug users, men who have sex with men, and female sex workers. Most analysis of RDS data has focused on estimating aggregate characteristics, such as disease prevalence. However, RDS is often conducted in settings where the population size is unknown and of great independent interest. This paper presents an approach to estimating the size of a target population based on data collected through RDS. The proposed approach uses a successive sampling approximation to RDS to leverage information in the ordered sequence of observed personal network sizes. The inference uses the Bayesian framework, allowing for the incorporation of prior knowledge. A flexible class of priors for the population size is used that aids elicitation. An extensive simulation study provides insight into the performance of the method for estimating population size under a broad range of conditions. A further study shows the approach also improves estimation of aggregate characteristics. Finally, the method demonstrates sensible results when used to estimate the size of known networked populations from the National Longitudinal Study of Adolescent Health, and when used to estimate the size of a hard-to-reach population at high risk for HIV.

Highlights

  • Respondent-Driven Sampling (RDS, introduced by Heckathorn 1997) is an approach to sampling from hard-to-reach human populations in the interest of conducting statistical inference, typically on population proportions

  • RDS is often used in studies of high-risk populations such as injecting drug users, men who have sex with men, and female sex workers

  • When unit sizes are associated with sampling probability, a systematic decline in observed unit sizes over time is indicative of the depletion of the available population
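
The depletion signal described in the last highlight can be illustrated with a small simulation. The sketch below is not the paper's estimator. It assumes a toy population whose unit sizes are uniform on 1 to 30, and draws units without replacement with probability proportional to size (a simple form of successive sampling). The function names `successive_sample` and `early_late_means`, and all numeric settings, are hypothetical choices made for illustration only.

```python
import random


def successive_sample(sizes, n, rng):
    """Draw n units without replacement; each draw picks a remaining
    unit with probability proportional to its size."""
    remaining = list(sizes)
    drawn = []
    for _ in range(n):
        idx = rng.choices(range(len(remaining)), weights=remaining, k=1)[0]
        # Move the chosen unit to the end, then pop it off the list.
        remaining[idx], remaining[-1] = remaining[-1], remaining[idx]
        drawn.append(remaining.pop())
    return drawn


def early_late_means(pop_size, n, reps, seed):
    """Average unit size in the first and second half of the sample,
    averaged over `reps` simulated populations (toy size distribution)."""
    rng = random.Random(seed)
    early = late = 0.0
    half = n // 2
    for _ in range(reps):
        sizes = [rng.randint(1, 30) for _ in range(pop_size)]  # assumed sizes
        sample = successive_sample(sizes, n, rng)
        early += sum(sample[:half]) / half
        late += sum(sample[half:]) / (n - half)
    return early / reps, late / reps


if __name__ == "__main__":
    for pop_size in (300, 2000):
        e, l = early_late_means(pop_size, n=200, reps=20, seed=1)
        print(f"N={pop_size}: early mean {e:.2f}, late mean {l:.2f}")
```

In this toy setting, a sample of 200 from a population of 300 exhausts the large units early, so the later draws have visibly smaller sizes. The same sample from a population of 2000 shows almost no decline. It is this contrast in the ordered sequence of sizes that carries information about the population size.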

Summary

Introduction

Respondent-Driven Sampling (RDS, introduced by Heckathorn 1997) is an approach to sampling from hard-to-reach human populations in the interest of conducting statistical inference, typically on population proportions. In such hard-to-reach populations, a sampling frame for the target population is not available, and members are difficult to identify or recruit from broader sampling frames.

An illustration of Respondent-Driven Sampling
Estimation of the size of population from RDS data
Overview of this paper
Bayesian inference for the population size
Likelihood for the super-population parameter
Modeling the RDS process
Bayesian inference for the unit size distribution
Estimating the size of the hidden population
Models for the unit size distribution
Prior for the unit size distribution model
Prior for the population size
Application to the National Longitudinal Study of Adolescent Health
A simulation study to assess frequentist properties
Point and interval estimation of population size
Impact of network structure
Estimation of population proportions
Findings
Discussion