Abstract
This paper offers a comprehensive analysis of the statistical disclosure limitation (SDL) methodologies that the U.S. Census Bureau applied to the 2010 and 2020 Decennial Census releases, from the perspective of the disclosure risk faced by the most vulnerable respondents. First, we review the SDL methodology used up to the 2010 Decennial Census, which was based on targeted swapping. Second, we examine recently reported reconstruction and reidentification results on the 2010 Decennial Census outputs, which form the foundation for the U.S. Census Bureau’s decision to switch to a differentially private (DP) method for the 2020 release. Third, we examine the actual privacy and data accuracy achieved by the DP method and compare them with the privacy and accuracy offered by the formerly employed swapping mechanism. We conclude that the DP method is not an adequate solution for protecting the typically sparse tables present in the Decennial Censuses: it does not offer meaningful privacy guarantees in general, it poorly protects the privacy of the most vulnerable respondents in particular, and it significantly degrades the quality of the released data. We also argue that the claimed disclosure risks of previous Census releases were overstated because of a flawed reidentification procedure. Therefore, the U.S. Census Bureau’s decision to change the SDL methodology to a DP-based one for the 2020 release was not only unwarranted, but also reduced privacy and data quality compared to former releases.