Unequal Probability Sampling Research Articles

The fitting of statistical distributions to chemical and microbial contamination data is a common application in risk assessment. These distributions are used to make inferences regarding even the most pedestrian of statistics, such as the population mean. The reason for the heavy reliance on a fitted distribution is the presence of left-, right-, and interval-censored observations in the data sets, with censored observations being the result of nondetects in an assay, the use of screening tests, and other practical limitations. Considerable effort has been expended to develop statistical distributions and fitting techniques for a wide variety of applications. Of the various fitting methods, Markov Chain Monte Carlo methods are common. An underlying assumption for many of the proposed Markov Chain Monte Carlo methods is that the data represent independent and identically distributed (iid) observations from an assumed distribution. This condition is satisfied when samples are collected using a simple random sampling design. Unfortunately, samples of food commodities are generally not collected in accordance with a strict probability design. Nevertheless, pseudosystematic sampling efforts (e.g., collection of a sample hourly or weekly) from a single location in the farm-to-table continuum are reasonable approximations of a simple random sample. The assumption that the data represent an iid sample from a single distribution is more difficult to defend if samples are collected at multiple locations in the farm-to-table continuum or risk-based sampling methods are employed to preferentially select samples that are more likely to be contaminated. This paper develops a weighted bootstrap estimation framework that is appropriate for fitting a distribution to microbiological samples that are collected with unequal probabilities of selection. An example based on microbial data, derived by the Most Probable Number technique, demonstrates the method and highlights the magnitude of biases in an estimator that ignores the effects of an unequal probability sample design.
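
The bias the abstract describes can be illustrated with a small simulation. The sketch below is not the paper's method: it ignores censoring and uses no Markov Chain Monte Carlo fitting. It only assumes a hypothetical population with a low-risk and a high-risk stratum, a risk-based design that oversamples the high-risk stratum, and a simple weighted bootstrap that resamples observations with probability proportional to the inverse of their selection probability. All names, stratum sizes, and probabilities are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population of log10 concentrations (assumed values, not from the paper).
low_risk = [rng.gauss(0.0, 1.0) for _ in range(9000)]
high_risk = [rng.gauss(2.0, 1.0) for _ in range(1000)]
true_mean = statistics.fmean(low_risk + high_risk)

# Risk-based design: each high-risk unit is ten times as likely to be selected.
sample, incl_probs = [], []
for stratum, p in ((low_risk, 0.01), (high_risk, 0.10)):
    for y in stratum:
        if rng.random() < p:
            sample.append(y)
            incl_probs.append(p)


def hajek_mean(values, probs):
    """Weighted mean with weights equal to inverse selection probabilities."""
    weights = [1.0 / p for p in probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def weighted_bootstrap_means(values, probs, n_boot=1000, seed=0):
    """Resample with probability proportional to the design weights,
    then take the plain mean of each replicate."""
    boot_rng = random.Random(seed)
    weights = [1.0 / p for p in probs]
    n = len(values)
    return [
        statistics.fmean(boot_rng.choices(values, weights=weights, k=n))
        for _ in range(n_boot)
    ]


naive_mean = statistics.fmean(sample)             # ignores the design
weighted_mean = hajek_mean(sample, incl_probs)    # accounts for the design
boot_means = sorted(weighted_bootstrap_means(sample, incl_probs))
# Approximate 2.5th and 97.5th percentiles of the 1000 replicates.
ci_low, ci_high = boot_means[24], boot_means[974]

print(f"true mean      {true_mean:.3f}")
print(f"naive mean     {naive_mean:.3f}")
print(f"weighted mean  {weighted_mean:.3f}")
print(f"bootstrap interval ({ci_low:.3f}, {ci_high:.3f})")
```

In this setup the naive mean is pulled toward the oversampled high-risk stratum, while the weighted estimate stays close to the population mean. The bootstrap interval here is only a rough indication of spread; it is not a substitute for a design-based variance estimator.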

Little is known about the biomarker-based prevalence of diabetes among U.S. adults aged 24-32 years, an age group historically characterized by low cardiovascular disease risk. We addressed the paucity of information within this age group among 15,701 participants at Wave IV of the National Longitudinal Study of Adolescent Health (Add Health, 2008), a study including nationally representative oversamples of racial/ethnic groups underrepresented by the National Health and Nutrition Examination Survey (NHANES). Capillary whole blood was collected via finger prick onto Whatman 903® Protein Saver cards by trained and certified field interviewers, desiccated, then shipped to central laboratories for assay and archival. Sensitivity of the glucose assay was 22 mg/dl. Assayed values in the lowest half percentile of the distribution were re-assayed. Re-assayed and original values were averaged. The within- and between-assay coefficients of variation (CVs) were 4.4% and 4.8%. For HbA1c, the corresponding sensitivity, within- and between-assay CVs were 3%, 2.2%, and 2.4%. In paired serum and blood spots, glucose concentrations (mg/dl) were strongly associated (n = 83; Pearson r = 0.97). Associations were equally strong for HbA1c (%) in paired whole blood and blood spots (n = 80; Pearson r = 0.99). In a race/ethnicity- and sex-stratified random sample of 100 Add Health participants among whom capillary whole blood was collected twice, one to two weeks apart, reliability of random (fasting ≥ 8 hr or non-fasting) glucose and HbA1c was estimated as an intra-class correlation coefficient and 95% confidence interval, ICC (95% CI): 0.39 (0.21, 0.58) and 0.97 (0.96-0.98). Add Health participants were more likely than similarly aged NHANES (2007-2008) participants to be native-born, insured, college educated, and overweight or obese. After weighting for unequal sampling probabilities and clustering, mean (standard deviation) HbA1c and fasting glucose were higher in Add Health than NHANES: 5.6% (0.8%) and 107 (35) mg/dl vs. 5.2% (0.5%) and 97 (14) mg/dl. The weighted prevalence (95% CI) of HbA1c ≥ 6.5% and fasting glucose ≥ 126 mg/dl also were higher in Add Health than NHANES: 3.6% (2.9-4.3) and 10.3% (8.7%-12.2%) vs. 1.7% (0.9%-3.2%) and 2.1% (0.8%-5.5%). Corresponding odds ratios (95% CIs) were: 2.1 (1.1-3.9) and 5.2 (2.1-13.3). Adjustment for sociodemographic, clinical and behavioral risk factors attenuated the associations: 1.5 (0.8-3.1) and 4.2 (1.7-10.4). However, the addition of self-reported history of diabetes and use of anti-diabetics had relatively little effect on them. Carefully standardized, in-home collection of whole blood spots can yield valid and reliable estimates of glucose and HbA1c. Their interpretation in the context of the prevalent obesity and hypertension at Add Health Wave IV reinforces suggestions that young U.S. adults face a historically high risk of cardiovascular disease.

Articles published on Unequal Probability Sampling

Fitting a distribution to censored contamination data using Markov Chain Monte Carlo methods and samples selected with unequal probabilities

Estimating population mean with missing data in unequal probability sampling

Estimation of the proportion of a sensitive attribute based on a two-stage randomized response model with stratified unequal probability sampling

Applying the Nonrandomized Diagonal Model to Estimate a Sensitive Distribution in Complex Sample Surveys

Estimation of Population Mean Under Unequal Probability Sampling with Unknown Selection Probabilities

Variance Estimation and Asymptotic Confidence Bands for the Mean Estimator of Sampled Functional Data with High Entropy Unequal Probability Sampling Designs

Replication variance estimation in unequal probability sampling without replacement: One‐stage and two‐stage

Complex national sampling design for long-term monitoring of protected dry grasslands in Switzerland

A new replicate variance estimator for unequal probability sampling without replacement

Fitting distributions to microbial contamination data collected with an unequal probability sampling design

Adaptive survey designs for sampling rare and clustered populations

Meta-Analysis of Choice Set Generation Effects on Route Choice Model Estimates and Predictions

Weighting in the regression analysis of survey data with a cross‐national application

Abstract P010: Dried Capillary Whole Blood Spot-Based Hemoglobin A1c, Fasting Glucose, and Diabetes Prevalence in a Nationally Representative Population of Young U.S. Adults: Add Health, Wave IV

On the inclusion probabilities in some unequal probability sampling plans without replacement

Size constrained unequal probability sampling with a non-integer sum of inclusion probabilities

A Gaussian copula approach for the analysis of secondary phenotypes in case-control genetic association studies

Spatially correlated Poisson sampling

A Direct Bootstrap Method for Complex Sampling Designs From a Finite Population

An Extension of Sampford's Method for Unequal Probability Sampling