Abstract

Microdata released for public use, including the Samples of Anonymised Records, Labour Force Survey data, and longitudinal datasets such as the Millennium Cohort Study, are subject to statistical disclosure control procedures. These measures include data swapping, recoding, post-randomisation, and suppression. Considerable research has examined the extent to which such measures protect respondent anonymity. However, their impact on the utility of the data has been neglected. In this case study we examine the impact of standard statistical disclosure control measures on the utility of a sample census microdata set. We consulted data users about the effect of variable recoding on the usefulness of the data, and we repeated published analyses on a file that had been subjected to perturbative disclosure control measures. We found that disclosure control measures had a significant impact both on the usability of the data (analytical completeness) and on the accuracy of the findings obtained when the data were analysed (analytical validity). The findings should be of interest to those involved in statistical disclosure control and to data users themselves. Although further research is required in this area, we conclude that data quality assessment should be a central part of disclosure control practice and that universal standards for the relationship between disclosure control and data utility should be developed.
