Abstract

One popular small area estimation method for estimating poverty and inequality indicators is the empirical best predictor under the unit-level nested error regression model with a continuous dependent variable. However, parameter estimation is more challenging when the response variable is grouped due to data confidentiality concerns or concerns about survey response burden. The work in this paper proposes methodology that enables fitting a nested error regression model when the dependent variable is grouped. Model parameters are then used for small area prediction of finite population parameters of interest. Model fitting in the case of a grouped response variable is based on the use of a stochastic expectation–maximization algorithm. Since the stochastic expectation–maximization algorithm relies on the Gaussian assumptions of the unit-level error terms, adaptive transformations are incorporated for handling departures from normality. The estimation of the mean squared error of the small area parameters is facilitated by a parametric bootstrap that captures the additional uncertainty due to the grouping mechanism and the possible use of adaptive transformations. The empirical properties of the proposed methodology are assessed by using model-based simulations and its relevance is illustrated by estimating deprivation indicators for municipalities in the Mexican state of Chiapas.
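The model-fitting idea described above can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: it assumes a plain linear regression with one covariate and Gaussian errors, and it omits the area-level random effect, the adaptive transformations and the parametric bootstrap. All function names (`fit_ols`, `start_value`, `draw_truncated_normal`, `sem_grouped_regression`) and the tuning values (`n_iter`, `burn_in`, `width`) are hypothetical choices made for illustration. Each iteration draws a pseudo-response for every unit from the current fitted normal distribution truncated to the unit's reported interval (stochastic step), then refits the regression on those draws (maximization step).

```python
import random
from math import inf
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal, used for cdf / inverse cdf


def fit_ols(x, y):
    """Simple linear regression; returns (intercept, slope, residual sd)."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, (rss / (n - 2)) ** 0.5


def start_value(lower, upper, width=1.0):
    """Starting pseudo-response: interval midpoint, or a point near the
    finite bound for an open-ended interval (assumes one bound is finite)."""
    if lower == -inf:
        return upper - width / 2
    if upper == inf:
        return lower + width / 2
    return (lower + upper) / 2


def draw_truncated_normal(mu, sigma, lower, upper, rng):
    """Inverse-cdf draw from N(mu, sigma^2) restricted to [lower, upper]."""
    p_lo = STD_NORMAL.cdf((lower - mu) / sigma)
    p_hi = STD_NORMAL.cdf((upper - mu) / sigma)
    u = rng.uniform(p_lo, p_hi)
    u = min(max(u, 1e-12), 1 - 1e-12)  # keep inv_cdf away from 0 and 1
    draw = mu + sigma * STD_NORMAL.inv_cdf(u)
    return min(max(draw, lower), upper)  # guard against rounding error


def sem_grouped_regression(x, lower, upper, n_iter=200, burn_in=50, seed=1):
    """Stochastic EM for a linear model whose response is only known
    up to an interval (lower[i], upper[i]) for each unit i."""
    rng = random.Random(seed)
    y = [start_value(lo, hi) for lo, hi in zip(lower, upper)]
    params = fit_ols(x, y)
    kept = []
    for it in range(n_iter):
        b0, b1, sigma = params
        # S-step: simulate pseudo-responses inside the observed intervals
        y = [draw_truncated_normal(b0 + b1 * xi, sigma, lo, hi, rng)
             for xi, lo, hi in zip(x, lower, upper)]
        # M-step: re-estimate the parameters from the pseudo-sample
        params = fit_ols(x, y)
        if it >= burn_in:
            kept.append(params)
    # Final estimates: average of the post-burn-in parameter draws
    return tuple(fmean(p[k] for p in kept) for k in range(3))
```

Averaging the parameter estimates after a burn-in period is one common way to summarize a stochastic EM run, since the iterates fluctuate around a value rather than converging to a single point. Extending this sketch to the nested error regression model would additionally require fitting a random-intercept model in the maximization step and conditioning the truncated draws on the predicted area effects.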

Highlights

  • Recent applications of small area estimation (SAE) methodologies have been concerned with the estimation of area-specific income indicators, for example the median income, the head count ratio (HCR) and the Gini coefficient (Rao & Molina, 2015; Rojas-Perilla et al., 2020; Tzavidis et al., 2018).

  • The results show that the empirical best predictor (EBP) fitted with the stochastic expectation–maximization (SEM) algorithm (a) outperforms the estimates obtained using midpoint regression and (b) performs close to the EBP when the continuous outcome is fully available.

  • The paper proposes an SAE methodology for working with a response variable that is grouped.


Summary

Introduction

Recent applications of small area estimation (SAE) methodologies have been concerned with the estimation of area-specific income indicators, for example the median income, the head count ratio (HCR) and the Gini coefficient (Rao & Molina, 2015; Rojas-Perilla et al., 2020; Tzavidis et al., 2018). Popular SAE methods that have been used in this context include the so-called World Bank method (Elbers et al., 2003) and the empirical best predictor (EBP) method (Molina & Rao, 2010). In these papers, SAE is based on the use of a unit-level nested error regression (random effects) model estimated with income or consumption as a response variable that is measured on a continuous scale. It is reasonable to expect that collecting grouped data may result in a loss of information compared to collecting data on a continuous scale. The impact of this loss of information on the quality of official statistics estimates is of particular importance. In the United Kingdom, the Office for National Statistics experimented with the collection of grouped income data in the lead-up to the 2001 census (Collins & White, 1996).

