Abstract

Privacy preservation is a paramount concern when publishing datasets that contain sensitive information. Preventing privacy disclosure and providing useful information to legitimate users for data analysis and mining are conflicting goals. Randomized response is a class of techniques that perturb each sensitive value in a controlled way, so that individual privacy is protected while the overall trend of the entire dataset remains recoverable. However, existing randomized response techniques do not allow the level of privacy protection to be configured flexibly, support only a few types of aggregate queries, and cannot achieve the best answer accuracy from the perturbed data. These drawbacks impair the effectiveness of those techniques. This paper proposes a general framework based on randomized response techniques that offers good flexibility and extensibility and improves the effectiveness of randomized response methods. Our approach is validated by extensive experiments and by comparison with existing randomized response and generalization methods.
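
To make the idea of perturbing each value while keeping the overall trend recoverable concrete, the following is a minimal sketch of the classic Warner-style randomized response for a binary attribute. It is an illustration only and is not the framework proposed in this paper; the function names `perturb` and `estimate_proportion` and the retention probability `p` are assumptions introduced for this example.

```python
import random


def perturb(values, p, rng):
    """Report each true binary value with probability p, its flip otherwise.

    Hypothetical helper: `values` is a list of 0/1 sensitive values,
    `p` is the assumed probability of reporting the truth.
    """
    return [v if rng.random() < p else 1 - v for v in values]


def estimate_proportion(reported, p):
    """Estimate the true fraction of 1s from the perturbed reports.

    If pi is the true fraction, the expected reported fraction is
    lam = pi * p + (1 - pi) * (1 - p), so pi = (lam - (1 - p)) / (2p - 1).
    The estimate is undefined for p = 0.5, where reports carry no signal.
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the true proportion unrecoverable")
    lam = sum(reported) / len(reported)
    return (lam - (1 - p)) / (2 * p - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data for illustration: 30% of records hold the sensitive value.
    true_values = [1] * 30_000 + [0] * 70_000
    reported = perturb(true_values, p=0.8, rng=rng)
    print(estimate_proportion(reported, p=0.8))
```

In this sketch, values of `p` closer to 0.5 give stronger protection for each individual record but a noisier estimate of the aggregate, which is the privacy and accuracy tension the abstract describes.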
