Abstract

OBJECTIVE
This paper details the validation of a methodology that spatially allocates Census microdata to census tracts, based on known, aggregate tract population distributions. To protect confidentiality, public-use microdata contain no spatial identifiers other than the code indicating the Public Use Microdata Area (PUMA) in which the individual or household is located. Confirmatory information, including the location of microdata households, can only be obtained in a Census Research Data Center (CRDC). Due to restrictions in place at CRDCs, a systematic procedure for validating the spatial allocation methodology needs to be implemented prior to accessing CRDC data.

METHODS
This study demonstrates and evaluates such an approach, using historical census data for which a 100% count of the full population is available at a fine spatial resolution. The approach described allows for testing of the behavior of a maximum entropy imputation and spatial allocation model under different specifications. The imputation and allocation are performed using a microdata sample of records drawn from the full 1880 Census enumeration and synthetic summary files created from the same source. The results of the allocation are then validated against the actual values from the 100% count of 1880.

RESULTS
The results indicate that the validation procedure provides useful statistics, allowing an in-depth evaluation of the household allocation and identifying optimal configurations for model parameterization. This provides important insights as to how to design a validation procedure at a CRDC for spatial allocations using contemporary census data.

(ProQuest: ... denotes formulae omitted.)

1. Introduction
Census public-use microdata possess an attribute richness which should make them tremendously useful to researchers interested in demographic small area estimation; however, they are underutilized, largely due to their coarse spatial resolution. 
The smallest identifiable geographic areas in Census microdata contain a minimum of 100,000 individuals, a restriction which may significantly compromise the geographic nature of a demographic study. Research which focuses on smaller geographic areas generally relies on a limited number of aggregate population characteristics provided by the Census Bureau in summary tables and cross-tabulations at the census tract or block group level. To better exploit the attribute richness of Census microdata at finer spatial scales, spatial allocation methods may be used; these allocate microdata households to small areas and generate summary statistics for the smaller geographic units from the attributes of the allocated households (Johnston and Pattie 1993; Ballas et al. 2005; Assuncao et al. 2005). Small area estimates, which contain extensive detail on the underlying population, are in great demand and are important to research on demographic and social processes such as migration, impoverishment, and human-environmental interactions.

A persistent shortcoming in the use of such spatial allocation methods for deriving demographic small area estimates is the lack of confirmatory validation. There are often few, if any, sources against which to compare the estimated fine-scale population counts and the associated distributions of population characteristics. The main reason for the absence of fine-resolution comparison data is the confidentiality protection in census surveys, which precludes the release of confirmatory data. Although demographic estimates based on U.S. 
Census data and geographies may be validated at a Census Research Data Center (CRDC), the expense of accessing a CRDC and the confidentiality restrictions in place there mandate that the validation process, which is not trivial, be fully worked out before it is implemented at the CRDC.

This article describes one procedure for validating demographic small area estimates derived from spatially allocated household microdata. …