Abstract

Recent work has shown the importance of considering the adversary's background knowledge when reasoning about privacy in data publishing. However, it is very difficult for the data publisher to know the adversary's background knowledge exactly, and existing work cannot satisfactorily model such knowledge or reason about privacy in its presence. This paper presents a general framework for modeling the adversary's background knowledge using kernel estimation methods. The framework subsumes different types of knowledge (e.g., negative association rules) that can be mined from the data. Under this framework, we reason about privacy using Bayesian inference techniques and propose the skyline (B, t)-privacy model, which allows the data publisher to enforce privacy requirements that protect the data against adversaries with different levels of background knowledge. Through an extensive set of experiments, we show the effects of probabilistic background knowledge on data anonymization and the effectiveness of our approach in both privacy protection and utility preservation.
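The abstract mentions kernel estimation as the tool for modeling an adversary's probabilistic background knowledge. As a rough illustration only (the paper's actual construction is not reproduced here), the sketch below builds a Gaussian kernel density estimate over hypothetical adversary prior probabilities; the sample values and bandwidth are illustrative assumptions, not data from the paper.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(samples, h):
    """Return a density function estimated from `samples` with bandwidth h.

    f(x) = (1 / (n*h)) * sum_i K((x - x_i) / h)
    """
    n = len(samples)
    def density(x):
        return sum(gaussian_kernel((x - s) / h) for s in samples) / (n * h)
    return density

# Hypothetical inputs: each sample is an adversary's prior probability that
# some record holds a sensitive value (e.g., derived from mined rules).
priors = [0.10, 0.15, 0.20, 0.12, 0.60]
f = kde(priors, h=0.05)
```

The resulting `f` is a smooth estimate of how adversary priors are distributed, which is the kind of object a publisher could then feed into Bayesian reasoning about disclosure risk.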
