Small area estimation of general finite-population parameters based on grouped data

Yuki Kawakubo, Genya Kobayashi

doi:10.1016/j.csda.2023.107741

Abstract

This paper proposes a new model-based approach to small area estimation of general finite-population parameters based on grouped data or frequency data, which are often available from sample surveys. Grouped data contain information on the frequencies of some pre-specified groups in each area, for example, the numbers of households in the income classes. Thus, grouped data provide more detailed insight into small areas than area-level aggregated data. A direct application of the widely used small area methods, such as the Fay–Herriot model for area-level data and the nested error regression model for unit-level data, is not appropriate since they are not designed for grouped data. Our novel method adopts the multinomial likelihood function for the grouped data. In order to connect the group probabilities of the multinomial likelihood and the auxiliary variables within the framework of small area estimation, we introduce the unobserved unit-level quantities of interest. After some transformation, they follow a linear mixed model with random intercepts and dispersions. Then the probabilities that a unit belongs to the groups can be derived and are used to construct the likelihood function for the grouped data given the random effects. The unknown model parameters (hyperparameters) are estimated by a newly developed Monte Carlo EM algorithm that uses efficient importance sampling. The empirical best predictors (empirical Bayes estimates) of small area parameters are calculated by a simple Gibbs sampling algorithm. The numerical performance of the proposed method is illustrated through model-based and design-based simulations. In the application to the city-level grouped income data of Japan, we complete the patchy maps of the Gini coefficient as well as mean income across the country.
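
To make the link between the latent unit-level model and the multinomial likelihood concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that the transformation is the logarithm and that the transformed unit-level quantity in an area is normal with mean `mean` (standing in for the regression term plus the random intercept) and standard deviation `sd` (standing in for the area-specific dispersion). The function names and the example thresholds and counts are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def group_probabilities(thresholds, mean, sd):
    """Probability that a unit falls in each group.

    Assumption (illustrative only): log(y) ~ N(mean, sd^2), where
    `thresholds` are increasing positive cut points on the original
    scale. With G - 1 cut points there are G groups.
    """
    cdf = [0.0]
    cdf += [normal_cdf((math.log(c) - mean) / sd) for c in thresholds]
    cdf += [1.0]
    # Each group probability is the difference of adjacent CDF values.
    return [upper - lower for lower, upper in zip(cdf[:-1], cdf[1:])]


def multinomial_loglik(counts, probs):
    """Multinomial log-likelihood of observed group counts given probs."""
    n = sum(counts)
    loglik = math.lgamma(n + 1)
    for k, p in zip(counts, probs):
        loglik -= math.lgamma(k + 1)
        if k > 0:
            loglik += k * math.log(p)
    return loglik


# Hypothetical area: three income cut points, four groups of households.
example_thresholds = [200.0, 400.0, 700.0]
example_counts = [12, 30, 25, 8]
example_probs = group_probabilities(example_thresholds, mean=6.0, sd=0.5)
example_loglik = multinomial_loglik(example_counts, example_probs)
```

Here the likelihood is conditional on fixed values of `mean` and `sd`; in the approach described above these depend on random effects, which would have to be integrated out or sampled rather than plugged in.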

Full Text