Abstract

Small area microdata contain the attributes and locations of individual members of a population in small census geographies. Such data are critical in research and policymaking, but they are often not publicly available because of confidentiality concerns. This limited access causes two problems. First, researchers may lack sufficient data for certain studies (data scarcity). Second, even when qualified researchers do obtain access, others can rarely reproduce their work (method irreproducibility). To address these issues, we develop a method to generate small area synthetic microdata (SASM) that are suitable for public use. Specifically, we propose an optimization approach that minimizes the difference between published census tables and the corresponding tables aggregated from the SASM. Two counties in Ohio are used as case studies to test the efficacy of the proposed method and the validity of the resulting SASM. The results show that the SASM align not only with the census tables, but also with an external data source that contains a sample of the small area microdata. We also illustrate how the SASM can be used to address data scarcity and method irreproducibility in demographic research.
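
The abstract does not state the objective function or the solver, so the following is only a minimal sketch of the general idea of fitting synthetic individuals to published tables. Everything in it is an assumption made for illustration: the two one-way tables (`TARGET_SEX`, `TARGET_AGE`) are hypothetical counts for a single small area, the misfit measure is total absolute error, and the search is a simple greedy local search rather than the authors' optimization approach.

```python
from collections import Counter
from itertools import product

# Hypothetical published tables for one small area (illustrative counts only).
TARGET_SEX = {"F": 6, "M": 4}
TARGET_AGE = {"0-17": 3, "18-64": 5, "65+": 2}

def tabulate(people):
    """Aggregate synthetic individuals (sex, age) into one-way tables."""
    sex = Counter(p[0] for p in people)
    age = Counter(p[1] for p in people)
    return sex, age

def total_absolute_error(people):
    """Sum of absolute cell differences between synthetic and published tables."""
    sex, age = tabulate(people)
    err = sum(abs(sex[k] - v) for k, v in TARGET_SEX.items())
    err += sum(abs(age[k] - v) for k, v in TARGET_AGE.items())
    return err

def fit(people):
    """Greedy local search: replace one individual at a time when it lowers the error."""
    people = list(people)
    candidates = list(product(TARGET_SEX, TARGET_AGE))
    best = total_absolute_error(people)
    improved = True
    while improved and best > 0:
        improved = False
        for i in range(len(people)):
            for cand in candidates:
                trial = people[:i] + [cand] + people[i + 1:]
                err = total_absolute_error(trial)
                if err < best:
                    people, best, improved = trial, err, True
    return people, best

# Start from an arbitrary population of the right size and fit it to the tables.
initial = [("M", "18-64")] * 10
synthetic, error = fit(initial)
```

A greedy search like this can stall in a local minimum on larger or conflicting tables. It is used here only because it is short and deterministic, and it says nothing about how the published method handles multi-way tables or many areas at once.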

