Abstract

We predict the finite population proportion of a small area when individual-level data are available from a survey and more extensive household-level (not individual-level) data (covariates but not responses) are available from a census. The census and the survey consist of the same strata and primary sampling units (PSUs, or wards), which are matched, but the households are not matched. Some covariates at the household level are common to the survey and the census, and these covariates are used to link the households within wards. There are also covariates at the ward level, and the wards are the same in the survey and the census. Using a two-stage procedure, we study the multinomial counts in the sampled households within the wards and a projection method to make inference about the non-sampled wards. This is accommodated by a multinomial-Dirichlet–Dirichlet model, a three-stage hierarchical Bayesian model for multinomial counts, as it is necessary to account for heterogeneity among the households. The key theoretical contribution of this paper is to develop a computational algorithm to sample the joint posterior density of the multinomial-Dirichlet–Dirichlet model. Specifically, we obtain samples from the distributions of the proportions for each multinomial cell. The second key contribution is to use two projection procedures (parametric, based on the nested error regression model, and non-parametric, based on iterative re-weighted least squares) on these proportions to link the survey to the census, thereby providing a copy of the census counts. We compare the multinomial-Dirichlet–Dirichlet (heterogeneous) model and the multinomial-Dirichlet (homogeneous) model without household effects via these two projection methods. An example using the second Nepal Living Standards Survey is presented.
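
As a rough illustration of what "samples from the distributions of the proportions for each multinomial cell" means, the sketch below covers only the simplest homogeneous case: a single ward whose counts are multinomial with a symmetric Dirichlet prior, so the posterior is again Dirichlet by conjugacy. The function name, the prior value, and the ward counts are all hypothetical. The paper's three-stage multinomial-Dirichlet–Dirichlet model needs its own sampling algorithm, which is not reproduced here.

```python
import numpy as np

def sample_cell_proportions(counts, prior_alpha=1.0, n_draws=2000, seed=0):
    """Draw posterior samples of multinomial cell proportions.

    Assumption (not from the paper): counts ~ Multinomial(n, p) with
    p ~ Dirichlet(prior_alpha, ..., prior_alpha), so that
    p | counts ~ Dirichlet(counts + prior_alpha) by conjugacy.
    """
    counts = np.asarray(counts, dtype=float)
    rng = np.random.default_rng(seed)
    return rng.dirichlet(counts + prior_alpha, size=n_draws)

# Hypothetical ward: members in excellent, good, fair, and poor health.
ward_counts = [3, 12, 5, 0]
draws = sample_cell_proportions(ward_counts)

# Posterior summaries for each of the four cells.
posterior_mean = draws.mean(axis=0)
interval_95 = np.percentile(draws, [2.5, 97.5], axis=0)
```

Note that the empty "poor health" cell still receives a small positive posterior mean because of the prior, which is one reason a Bayesian treatment is convenient for sparse counts.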

Highlights

Published: 9 June 2021

  • In a study on health, one might need to know how many people are in excellent health, good health, fair health, or poor health in different households within different counties in a state

  • The second Nepal Living Standards Survey has sparse counts of household members within wards for four health status groups

  • There is heterogeneity within wards, which are the small areas (e.g., Rao and Molina [1]); ignoring it when making inference about the finite population proportions within each ward could lead to biased estimates and incorrect measures of variability
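
To see why ignoring household heterogeneity can misstate variability, the following simulation compares household counts generated with one shared set of proportions against counts where each household draws its own proportions from a Dirichlet distribution. It is a minimal sketch under assumed values: `mu`, `tau`, and the household size are hypothetical and are not taken from the Nepal data.

```python
import numpy as np

rng = np.random.default_rng(1)

mu = np.array([0.15, 0.50, 0.25, 0.10])  # hypothetical ward-level proportions
tau = 4.0            # hypothetical concentration; smaller means more heterogeneity
n_households = 5000
household_size = 5

# Homogeneous: every household shares the same proportions mu.
homog = rng.multinomial(household_size, mu, size=n_households)

# Heterogeneous: each household has its own p_h ~ Dirichlet(tau * mu).
p_h = rng.dirichlet(tau * mu, size=n_households)
heter = np.array([rng.multinomial(household_size, p) for p in p_h])

# Cell means agree, but the heterogeneous counts are overdispersed.
mean_homog, mean_heter = homog.mean(axis=0), heter.mean(axis=0)
var_homog, var_heter = homog.var(axis=0), heter.var(axis=0)
```

Under these assumptions both sets of counts have the same expected value per cell, while the heterogeneous counts have their variance inflated by the factor (household_size + tau) / (1 + tau), so a homogeneous model fitted to such data would understate the variability.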


Summary

A Bayesian Approach to Linking a Survey and a Census via Small Areas. Stats 2021, 4.

We predict the finite population proportion of a small area when individual-level data are available from a survey and more extensive household-level (not individual-level) data (covariates but not responses) are available from a census. Using a two-stage procedure, we study the multinomial counts in the sampled households within the wards and a projection method to make inference about the non-sampled wards. This is accommodated by a multinomial-Dirichlet–Dirichlet model, a three-stage hierarchical Bayesian model for multinomial counts, as it is necessary to account for heterogeneity among the households. The second key contribution is to use two projection procedures (parametric, based on the nested error regression model, and non-parametric, based on iterative re-weighted least squares) on these proportions to link the survey to the census, thereby providing a copy of the census counts.

Introduction
Hierarchical Bayesian Models
Multinomial-Dirichlet Model
Multinomial-Dirichlet–Dirichlet Model
Bayesian Computations
Multinomial-Dirichlet Model
Inference for Finite Population Proportions
Iterative Re-Weighted Least Squares Method
Nested Error Regression Method
Bayesian Projective Inference
Analysis of Nepal’s Data
Census and NLSS II
Comparisons
Method
Calibration
Concluding Remarks
