Abstract
This paper builds on the results of the ESANN 2019 conference paper Privacy Preserving Synthetic Health Data [16], which develops metrics for assessing the privacy and utility of synthetic data and generative models. The metrics laid out in the initial paper show that synthetic data can retain utility while preserving the privacy of both the model and the data being generated. Specifically, we focused on the success of the Wasserstein GAN method, renamed HealthGAN, in comparison to other data-generating methods. In this paper, we provide additional novel metrics to quantify the susceptibility of these generative models to membership inference attacks [14]. We also introduce Discriminator Testing, a new method of determining whether the different generators overfit the training data, potentially resulting in privacy losses. These privacy issues are of high importance as we prepare a final workflow for generating synthetic data from real data in a secure environment. The results of these tests complement the initial tests: they show that the Parzen windows method, despite having a low privacy loss under the adversarial accuracy metrics, fails to preserve privacy against the membership inference attack. Only HealthGAN achieves optimal values on both the privacy-loss metrics and the membership inference attack. Discriminator Testing adds further confidence, as HealthGAN retains resemblance to the training data without reproducing it.
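To make the threat model concrete, the following is a minimal sketch of a distance-based membership inference attack against a generative model. It is illustrative only and is not the metric defined in the paper: the function name `membership_inference_attack`, the Euclidean nearest-neighbor rule, and the fixed `threshold` are all assumptions. The intuition is that if a generator overfits, records from its training set tend to lie unusually close to some synthetic sample, so an attacker can flag "close" candidates as likely training members.

```python
import numpy as np

def membership_inference_attack(synthetic, candidates, threshold):
    """Illustrative distance-based membership inference (not the paper's metric).

    Flags a candidate record as a likely training-set member when its
    nearest synthetic sample lies within `threshold` Euclidean distance.
    """
    preds = []
    for x in candidates:
        # Distance from this candidate to its nearest synthetic sample.
        nearest = np.min(np.linalg.norm(synthetic - x, axis=1))
        preds.append(nearest < threshold)
    return np.array(preds)

# Hypothetical example: one candidate nearly coincides with a synthetic
# sample (suspicious), the other is far from all of them.
synthetic = np.array([[0.0, 0.0], [1.0, 1.0]])
candidates = np.array([[0.0, 0.01], [5.0, 5.0]])
print(membership_inference_attack(synthetic, candidates, threshold=0.5))
```

A generator that merely memorizes its training data makes this attack highly accurate, which is the failure mode the paper's metrics are designed to detect.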