Abstract

We propose a practical methodology to protect a user's private data when the user wishes to publicly release data that is correlated with that private data, in the hope of obtaining some utility. Our approach relies on a general statistical inference framework that captures the privacy threat under inference attacks, given utility constraints. Under this framework, data is distorted before it is released, according to a privacy-preserving probabilistic mapping. This mapping is obtained by solving a convex optimization problem, which minimizes information leakage under a distortion constraint. We address a practical challenge encountered when applying this theoretical framework to real-world data: the optimization may become intractable and face scalability issues when the data takes values in large alphabets or is high-dimensional. Our work makes two major contributions. First, we reduce the size of the optimization by introducing a quantization step, and show how to generate privacy mappings under quantization. Second, we evaluate our method on a dataset showing correlations between political views and TV viewing habits, and demonstrate that good privacy properties can be achieved with limited distortion, so as not to undermine the original purpose of the publicly released data, e.g., recommendations.
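When all alphabets are finite, the privacy mapping described in the abstract can be phrased as a small convex program: minimize the information leakage I(S; Y) between the private attribute S and the released data Y over the mapping p(y|x), subject to a bound on the expected distortion E[d(X, Y)]. The sketch below illustrates this under those assumptions using the cvxpy modeling library; the names (p_sx, distortion, budget) and the exact formulation are illustrative assumptions, not the paper's own code.

# Illustrative sketch (not the paper's code): privacy mapping p(y|x) over finite alphabets.
import numpy as np
import cvxpy as cp

def privacy_mapping(p_sx, distortion, budget):
    """Compute a privacy-preserving mapping p(y|x).

    p_sx       : joint distribution of (S, X), shape (|S|, |X|)
    distortion : distortion values d(x, y), shape (|X|, |Y|)
    budget     : bound on the expected distortion E[d(X, Y)]
    """
    n_s, n_x = p_sx.shape
    n_y = distortion.shape[1]
    p_x = p_sx.sum(axis=0)                      # marginal of X
    p_s = p_sx.sum(axis=1)                      # marginal of S

    Q = cp.Variable((n_x, n_y), nonneg=True)    # the mapping p(y|x)

    p_sy = p_sx @ Q                             # joint of (S, Y), affine in Q
    p_y = p_x @ Q                               # marginal of Y, affine in Q

    # Information leakage I(S; Y) = sum_{s,y} p(s,y) log( p(s,y) / (p(s) p(y)) ),
    # written with the jointly convex relative-entropy atom so the problem is convex (DCP).
    outer_s_y = p_s.reshape(n_s, 1) @ cp.reshape(p_y, (1, n_y))
    leakage = cp.sum(cp.rel_entr(p_sy, outer_s_y))

    constraints = [
        cp.sum(Q, axis=1) == 1,                                        # rows of p(y|x) sum to 1
        cp.sum(cp.multiply(Q, distortion * p_x[:, None])) <= budget,   # E[d(X, Y)] <= budget
    ]
    cp.Problem(cp.Minimize(leakage), constraints).solve()
    return Q.value

This sketch covers only the simplest case of a mapping from X to Y; it is meant to convey why the problem is convex and why its size grows with the alphabet sizes, which is the scalability issue the paper's quantization step is designed to address.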
