Generative machine learning models such as Generative Adversarial Networks (GANs) have proven especially successful at generating realistic synthetic data in the image and tabular domains. However, such generative models, as well as the synthetic data they produce, can reveal information contained in their privacy-sensitive training data and must therefore be carefully evaluated before use. The gold-standard method for estimating this privacy leakage is simulating membership inference attacks (MIAs), in which an attacker attempts to learn whether a given sample was part of a generative model's training data. State-of-the-art MIAs against generative models, however, either rely on strong assumptions (knowledge of the exact training dataset size) or require substantial computational resources (to retrain many "surrogate" generative models), making them difficult to apply in practice. In this work, we propose a technique for evaluating privacy risks in GANs that exploits the outputs of the discriminator, a component of the standard GAN architecture. We evaluate the performance of our attacks on two synthetic image generation applications in radiology and ophthalmology, showing that our technique provides a more complete picture of the threats by performing worst-case privacy risk estimation and by identifying attacks with higher precision than prior work.
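To illustrate the general idea of a discriminator-based membership inference attack described above, the following is a minimal sketch, not the paper's actual method: it assumes a trained PyTorch GAN whose discriminator maps a batch of images to real/fake logits, and thresholds the resulting confidence scores. The names (`D`, `loader`, `threshold`) are hypothetical.

```python
# Minimal sketch of a discriminator-score membership inference attack.
# Assumes a trained PyTorch discriminator D producing real/fake logits;
# all names here are illustrative, not taken from the paper.
import torch


@torch.no_grad()
def discriminator_scores(D, loader, device="cpu"):
    """Collect the discriminator's real-confidence score for each candidate sample."""
    D.eval().to(device)
    scores = []
    for x, _ in loader:
        logits = D(x.to(device)).view(-1)        # higher logit -> "more real"
        scores.append(torch.sigmoid(logits).cpu())
    return torch.cat(scores)


def membership_predictions(scores, threshold=0.5):
    """Predict 'member' when the discriminator is unusually confident a sample is real.

    Intuition: the discriminator has seen training members repeatedly and tends to
    assign them higher real-scores than unseen samples from the same distribution.
    In a real evaluation the threshold would be calibrated, e.g. for a target
    precision or false-positive rate.
    """
    return scores >= threshold
```

In practice, the threshold (or a more refined decision rule) would be calibrated per model, and attack quality reported with precision-oriented metrics, in line with the worst-case risk estimation emphasized in the abstract.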