Plausible deniability for privacy-preserving data synthesis

Vincent Bindschaedler, Reza Shokri, Carl A Gunter

doi:10.14778/3055540.3055542

Abstract

Releasing full data records is one of the most challenging problems in data privacy. On the one hand, many popular techniques, such as data de-identification, are problematic because they depend on the background knowledge of adversaries. On the other hand, rigorous methods such as the exponential mechanism for differential privacy are often computationally impractical for releasing high-dimensional data, or cannot preserve high utility of the original data because of their extensive data perturbation. This paper presents a criterion called plausible deniability that provides a formal privacy guarantee, notably for releasing sensitive datasets: an output record can be released only if a certain number of input records are indistinguishable, up to a privacy parameter. This notion does not depend on the background knowledge of an adversary, and it can be checked efficiently by privacy tests. We present mechanisms to generate synthetic datasets that have the same format as the input data and similar statistical properties. We study this technique both theoretically and experimentally. A key theoretical result shows that, with proper randomization, the plausible deniability mechanism generates differentially private synthetic data. We demonstrate the efficiency of this generative technique on a large dataset; it is shown to preserve the utility of the original data with respect to various statistical analysis and machine learning measures.
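The release criterion described in the abstract can be illustrated with a small sketch. The abstract does not give the formal definition, so everything below is an assumption made for illustration: "indistinguishable up to a privacy parameter" is read as the probability of producing the output from a given input record being within a factor `gamma` of the probability of producing it from the actual seed record, and "a certain number" is a threshold `k` that counts the seed itself. The bit-flipping mechanism and the names `release_prob`, `generate`, and `passes_privacy_test` are hypothetical and are not the paper's generative model. The randomization that the abstract links to differential privacy is not shown.

```python
import random


def release_prob(seed, output, flip=0.2):
    """Pr[output | seed] for a toy mechanism (an assumption, not the paper's
    model): each binary attribute of the seed is flipped independently with
    probability `flip`."""
    p = 1.0
    for s, o in zip(seed, output):
        p *= (1.0 - flip) if s == o else flip
    return p


def generate(seed, rng, flip=0.2):
    """Sample a synthetic record from the toy mechanism."""
    return tuple(1 - b if rng.random() < flip else b for b in seed)


def passes_privacy_test(dataset, seed, output, k, gamma, flip=0.2):
    """Deterministic privacy test (sketch): release `output` only if at least
    `k` input records, counting the seed, could have produced it with a
    probability within a factor `gamma` of the seed's probability."""
    p_seed = release_prob(seed, output, flip)  # > 0 because 0 < flip < 1
    plausible = sum(
        1
        for record in dataset
        if 1.0 / gamma <= release_prob(record, output, flip) / p_seed <= gamma
    )
    return plausible >= k


if __name__ == "__main__":
    rng = random.Random(0)
    dataset = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 1)]
    seed = dataset[0]
    candidate = generate(seed, rng)
    # The candidate is released only if the test passes.
    if passes_privacy_test(dataset, seed, candidate, k=2, gamma=5.0):
        print("release", candidate)
    else:
        print("discard", candidate)
```

In this reading, a larger `k` or a smaller `gamma` makes the test stricter, so fewer candidate records are released. The test only inspects the input dataset and the mechanism's probabilities, which is consistent with the abstract's point that the criterion does not depend on an adversary's background knowledge.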

Journal: Proceedings of the VLDB Endowment
Publication Date: Jan 1, 2017
Citations: 118

Similar Papers

A Comparative Analysis of Data Privacy and Utility Parameter Adjustment, Using Machine Learning Classification as a Gauge
Kato Mivule ... Claude Turner
01 Jan 2013
Procedia Computer Science | VOL. 20

Granular data representation under privacy protection: Tradeoff between data utility and privacy via information granularity
Ge Zhang ... Zhiwu Li
12 Nov 2022
Applied Soft Computing | VOL. 131

Towards an Integrated Approach for Preserving Data Utility, Privacy and Fairness
Mortaza S Bargh ... Sunil Choenni
30 Dec 2022
2018 International Conference on Multidisciplinary Research | VOL. 2022

A Bayesian perspective of statistical machine learning for big data
Rajiv Sambasivan ... Sourish Das
01 Apr 2020
Computational Statistics | VOL. 35
