Abstract

The aim of this article is to analyse the possibility of applying selected perturbative masking methods of Statistical Disclosure Control to microdata, i.e. unit-level data from the Labour Force Survey. In the first step, the author assessed to what extent the confidentiality of information was protected in the original dataset. In the second step, selected methods implemented in the sdcMicro package for R were applied, and their impact on the disclosure risk, the loss of information and the quality of estimation of population quantities was assessed. The conclusion highlights some problematic aspects of the use of Statistical Disclosure Control methods which were observed during the analysis.

Highlights

  • This article is a response to the growing demand, observed in recent years, for increasingly detailed statistical information which is of interest to both the public sector and the private sector

  • In the case of sample surveys, each observation may have a different sampling weight, so that, after generalisation, consistency is not assured. This method is recommended when there are at least six quasi-identifiers in microdata or when the use of non-perturbative methods would result in excessive information loss (Hundepool et al., 2012; Templ, 2017)

  • It should be emphasised that Labour Force Survey (LFS) microdata are affected by sampling and non-sampling errors, and the application of Statistical Disclosure Control (SDC) methods will be another source of error


Summary

Introduction

This article is a response to the growing demand, observed in recent years, for increasingly detailed statistical information which is of interest to both the public sector (authorities, administration, universities, or research institutes) and the private sector (business entities). Microdata should be verified to eliminate or reduce the risk of disclosure, while simultaneously minimising the loss of information. This process is called Statistical Disclosure Control (SDC). The aim of the article is to analyse the possibility of using selected perturbative methods to protect LFS microdata against the risk of disclosure. The second section presents selected methods of evaluating microdata confidentiality which are available in the sdcMicro package. Attention is focused on how the problem of information loss is handled by the sdcMicro package as well as on point estimates and their quality. The article ends with conclusions, a brief summary of problematic aspects of SDC for microdata, and an outline of further research work.
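
To make the notion of disclosure risk more concrete, the following minimal sketch counts how often each combination of quasi-identifiers occurs in a sample and flags records that are unique or nearly unique (a k-anonymity check). It is written in plain Python for illustration only and is not the sdcMicro implementation; the variable names (sex, age_group, region, weight), the toy records and the threshold k are assumptions introduced here, not taken from the LFS data.

```python
from collections import defaultdict


def frequency_counts(records, quasi_identifiers, weight_key="weight"):
    """For each record, return (sample frequency, weighted frequency) of its key.

    The key is the combination of quasi-identifier values. The weighted
    frequency sums the sampling weights of all records sharing that key,
    which gives a rough estimate of the population frequency.
    """
    sample = defaultdict(int)
    weighted = defaultdict(float)
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    for key, r in zip(keys, records):
        sample[key] += 1
        weighted[key] += r[weight_key]
    return [(sample[key], weighted[key]) for key in keys]


def violates_k_anonymity(records, quasi_identifiers, k=2):
    """Flag records whose key occurs fewer than k times in the sample."""
    counts = frequency_counts(records, quasi_identifiers)
    return [f < k for f, _ in counts]


# Hypothetical toy microdata (illustrative values only).
records = [
    {"sex": "F", "age_group": "25-34", "region": "A", "weight": 120.0},
    {"sex": "F", "age_group": "25-34", "region": "A", "weight": 80.0},
    {"sex": "M", "age_group": "55-64", "region": "B", "weight": 300.0},
]
qi = ["sex", "age_group", "region"]
flags = violates_k_anonymity(records, qi, k=2)
```

In this toy example the third record is the only one with its combination of values, so it would be flagged as violating 2-anonymity. Masking methods aim to reduce the number of such records while keeping weighted estimates close to those from the original data.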

The measurement of microdata confidentiality
Selected perturbative masking methods
The measurement of microdata utility
The confidentiality of original microdata
The estimation of the unemployment rate and its precision
Findings
Conclusions and further work