Linear area-level models for small area estimation, such as the Fay-Herriot model, face challenges when applied to discrete survey data. Such data commonly arise as direct survey estimates of the number of persons possessing some characteristic, such as the number of persons in poverty. For such applications, we examine a binomial/logit normal (BLN) model that assumes a binomial distribution for rescaled survey estimates and a normal distribution with a linear regression mean function for logits of the true proportions. Effective sample sizes are defined so that the binomial variances, conditional on the true proportions, equal the corresponding sampling variances of the direct survey estimates. We extend the BLN model to bivariate and time series (first-order autoregressive) versions to permit borrowing information from past survey estimates, then apply these models to data used by the U.S. Census Bureau’s Small Area Income and Poverty Estimates (SAIPE) program to predict county poverty for school-age children. We compare prediction results from the alternative models to see how much the bivariate and time series models reduce prediction error variances relative to the univariate BLN model. Standard conditional variance calculations for the corresponding linear Gaussian models, which indicate how much variance reduction to expect from borrowing information over time, agree generally with the BLN empirical results.
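
As a minimal sketch of the univariate BLN structure described above, the model might be written as follows. The notation is assumed for illustration and does not come from the source: $y_i$ is the rescaled direct estimate for area $i$, $\tilde{n}_i$ the effective sample size, $p_i$ the true proportion, $v_i$ the sampling variance of the direct estimate of the proportion, $x_i$ a vector of regression covariates, and $u_i$ an area-level random effect with variance $\sigma_u^2$.

```latex
\begin{align*}
  y_i \mid p_i &\sim \mathrm{Binomial}(\tilde{n}_i,\, p_i), \\
  \operatorname{logit}(p_i) &= x_i^{\top}\beta + u_i,
      \qquad u_i \sim N(0,\, \sigma_u^2), \\
  \operatorname{Var}\!\left( y_i / \tilde{n}_i \mid p_i \right)
      &= \frac{p_i (1 - p_i)}{\tilde{n}_i} = v_i
      \quad\Longrightarrow\quad
      \tilde{n}_i = \frac{p_i (1 - p_i)}{v_i}.
\end{align*}
```

The last line restates the effective sample size condition: the binomial variance of the proportion, given $p_i$, is matched to the sampling variance $v_i$. Since $p_i$ is unknown in practice, some estimate would have to stand in for it when computing $\tilde{n}_i$; the abstract does not say which.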