A general framework for privacy-preserving of data publication based on randomized response techniques

Chaobin Liu, Shixi Chen, Shuigeng Zhou, Jihong Guan, Yao Ma

doi:10.1016/j.is.2020.101648

Open Access

https://doi.org/10.1016/j.is.2020.101648

Abstract

Privacy preservation is a paramount concern when publishing datasets that contain sensitive information. Preventing privacy disclosure and providing useful information to legitimate users for data analysis and mining are conflicting goals. Randomized response is a class of techniques that perturb each sensitive value in a randomized way, so that personal privacy is protected while the overall trends of the entire dataset remain recoverable. However, existing randomized response techniques do not allow the level of privacy protection to be configured flexibly, support only a few types of aggregate queries, and cannot achieve the best answer accuracy from perturbed data. These drawbacks impair the effectiveness of those techniques. This paper proposes a general framework based on randomized response techniques that offers good flexibility and extensibility and can improve the effectiveness of randomized response methods. Our approach is validated by extensive experiments and by comparison with existing randomized response and generalization methods.
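
To make the basic idea of randomized response concrete, the following is a minimal sketch of the classic binary scheme usually attributed to Warner. It is not the framework proposed in this paper. The function names, the retention probability p = 0.8, and the synthetic dataset are all assumptions chosen for illustration. Each sensitive Boolean value is reported truthfully with probability p and flipped otherwise, and the true proportion is then estimated from the perturbed reports alone.

```python
import random
from typing import Sequence


def perturb(value: bool, p: float, rng: random.Random) -> bool:
    """Report the true value with probability p, otherwise its negation."""
    return value if rng.random() < p else not value


def estimate_proportion(reports: Sequence[bool], p: float) -> float:
    """Estimate the true fraction of True values from perturbed reports.

    If pi is the true fraction, the expected observed fraction is
    p * pi + (1 - p) * (1 - pi), which is solved here for pi.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the reports independent of the data")
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)


# Hypothetical dataset: 10,000 individuals, 30% of whom hold the sensitive value.
rng = random.Random(0)
true_values = [i < 3000 for i in range(10000)]
reports = [perturb(v, 0.8, rng) for v in true_values]
estimate = estimate_proportion(reports, 0.8)
print(f"estimated proportion: {estimate:.3f}")
```

In this sketch the single parameter p controls the trade-off described above: values of p closer to 0.5 give each individual more deniability but increase the variance of the estimate, while values closer to 1 do the opposite.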

Full Text