Outlier Protection in Continuous Microdata Masking

Josep Maria Mateo-Sanz, Josep Domingo-Ferrer, Francesc Sebé

doi:10.1007/978-3-540-25955-8_16

Abstract

Masking methods protect data sets against disclosure by perturbing the original values before publication. Masking causes some information loss (masked data are not exactly the same as the original data) and does not completely eliminate the risk of disclosure for the individuals behind the data set. Information loss can be measured by observing the differences between original and masked data, while disclosure risk can be measured by means of record linkage and confidentiality intervals. Outliers in the original data set are particularly difficult to protect, as they correspond to extreme individuals who stand out from the rest. The objective of our work is to compare, for different masking methods, the information loss and disclosure risk related to outliers. In this way, the level of protection that different masking methods offer to extreme individuals can be evaluated.
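To make the three measures named in the abstract concrete, the following is a minimal sketch on a single continuous attribute. It is not the procedure used in the paper: the function names, the use of mean absolute difference as the information loss measure, nearest-value matching as the record linkage rule, and a fixed interval half-width are all assumptions chosen for illustration, and the toy values are invented.

```python
def information_loss(original, masked):
    """Mean absolute difference between original and masked values (assumed measure)."""
    return sum(abs(o - m) for o, m in zip(original, masked)) / len(original)


def linked_correctly(original, masked, i):
    """Distance-based record linkage on one attribute (assumed rule):
    masked record i is re-identified if its nearest original value is record i."""
    nearest = min(range(len(original)), key=lambda j: abs(original[j] - masked[i]))
    return nearest == i


def in_interval(original, masked, i, half_width):
    """Interval disclosure (assumed form): the original value of record i lies
    within +/- half_width of its masked value."""
    return abs(original[i] - masked[i]) <= half_width


# Invented toy data: four similar records and one outlier (last record).
original = [10.0, 11.0, 12.0, 13.0, 100.0]
# Invented masked values; the outlier receives the largest perturbation.
masked = [11.4, 10.2, 12.4, 12.1, 93.0]

loss = information_loss(original, masked)
linked = [linked_correctly(original, masked, i) for i in range(len(original))]
outlier_in_interval = in_interval(original, masked, 4, half_width=10.0)
```

In this toy case the outlier is perturbed far more than the other records, yet it is still linked to its original record because no other record lies nearby. This is the kind of effect the comparison of masking methods is meant to quantify.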

Full Text