How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy
Machine Learning (ML) models are ubiquitous in real-world applications and are a constant focus of research. Modern ML models have become more complex, deeper, and harder to reason about. At the same time, the community has started to realize the importance of protecting the privacy of the training data that goes into these models. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP to real-world, complex ML models are still few and far between. The adoption of DP is hindered by limited practical guidance on what DP protection entails, what privacy guarantees to aim for, and the difficulty of achieving good privacy-utility-computation trade-offs for ML models. Tricks for tuning and maximizing performance are scattered among papers or stored in the heads of practitioners, particularly with respect to the challenging task of hyperparameter tuning. Furthermore, the literature seems to present conflicting evidence on how and whether to apply architectural adjustments and which components are “safe” to use with DP. In this survey paper, we attempt to create a self-contained guide that gives an in-depth overview of the field of DP ML. We aim to assemble information about achieving the best possible DP ML model with rigorous privacy guarantees. Our target audience is both researchers and practitioners. Researchers interested in DP for ML will benefit from a clear overview of current advances and areas for improvement. We also include theory-focused sections that highlight important topics such as privacy accounting and convergence. For a practitioner, this survey provides a background in DP theory and a clear step-by-step guide for choosing an appropriate privacy definition and approach, implementing DP training, potentially updating the model architecture, and tuning hyperparameters. For both researchers and practitioners, consistently and fully reporting privacy guarantees is critical, so we propose a set of specific best practices for stating guarantees. With sufficient computation and a sufficiently large training set or supplemental non-private data, both good accuracy (that is, almost as good as a non-private model) and good privacy are often achievable. Even when computation and dataset size are limited, there are advantages to training with even a weak (but still finite) formal DP guarantee. Hence, we hope this work will facilitate more widespread deployments of DP ML models.
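The standard algorithm behind "implementing DP training" is DP-SGD: clip each example's gradient to a fixed norm, then add calibrated Gaussian noise to the aggregate. Below is a minimal NumPy sketch of that loop for logistic regression; the hyperparameter values are illustrative only, and a real deployment would use a vetted DP library with a proper privacy accountant rather than this toy.

```python
# Minimal sketch of DP-SGD for logistic regression (NumPy only).
# clip_norm, noise_mult, lr, and batch are illustrative, not recommendations.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                      # toy features
y = (X @ rng.normal(size=20) > 0).astype(float)      # toy labels
w = np.zeros(20)

clip_norm, noise_mult, lr, batch = 1.0, 1.1, 0.1, 100
for step in range(200):
    idx = rng.choice(len(X), size=batch, replace=False)
    xb, yb = X[idx], y[idx]
    p = 1 / (1 + np.exp(-xb @ w))
    per_example_grads = (p - yb)[:, None] * xb        # one gradient per example
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)
    noise = rng.normal(scale=noise_mult * clip_norm, size=w.shape)
    w -= lr * (clipped.sum(axis=0) + noise) / batch   # noisy average gradient step
```

Per-example clipping bounds each individual's influence on the update, which is what lets a privacy accountant convert the noise scale into a formal (epsilon, delta) guarantee.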
- Research Article
- 10.1145/3476415.3476435
- Jun 1, 2021
- ACM SIGIR Forum
Machine learning (ML) has become a core technology for many real-world applications. Modern ML models are applied to unprecedentedly complex and difficult challenges, including very large and subjective problems. For instance, applications in multimedia understanding have advanced substantially. It is already prevalent for cultural and artistic objects such as music and videos to be analyzed and served to users according to their preferences, enabled by ML techniques. One of the most significant recent breakthroughs in ML is Deep Learning (DL), which has been widely adopted to tackle such complex problems. DL allows for higher learning capacity, making end-to-end learning possible, which reduces the need for substantial engineering effort while achieving high effectiveness. At the same time, this also makes DL models more complex than conventional ML models. Reports in several domains indicate that such complex ML models may harbor potentially critical hidden problems: various biases embedded in the training data can emerge in the predictions, and highly sensitive models can make unaccountable mistakes. Furthermore, the black-box nature of DL models hinders the interpretation of the mechanisms behind them. Such unexpected drawbacks significantly affect the trustworthiness of the systems that employ ML models as their core apparatus. In this thesis, a series of studies investigates aspects of trustworthiness for complex ML applications, namely reliability and explainability. Specifically, we focus on music as the primary domain of interest, considering its complexity and subjectivity. Because of this nature of music, ML models for music must be complex to achieve meaningful effectiveness. As such, the reliability and explainability of music ML models are crucial in the field. The first main chapter of the thesis investigates the transferability of neural networks in the Music Information Retrieval (MIR) context. Transfer learning, where pre-trained ML models are used as off-the-shelf modules for the task at hand, has become one of the major ML practices. It is helpful because a substantial amount of information is already encoded in the pre-trained models, which allows a model to achieve high effectiveness even when data for the current task is scarce. However, this may not always hold if the "source" task on which the model was pre-trained shares little commonality with the "target" task at hand. An experiment including multiple "source" tasks and "target" tasks was conducted to examine the conditions that positively affect transferability. The results suggest that the number of source tasks is a major factor in transferability. At the same time, there is little evidence of a single source task that is universally effective across multiple target tasks. Overall, we conclude that considering multiple pre-trained models, or pre-training a model on heterogeneous source tasks, can increase the chance of successful transfer learning. The second major work investigates the robustness of DL models in the transfer learning context. The hypothesis is that DL models can be susceptible to imperceptible noise in the input, which may drastically shift the analysis of similarity among inputs; this is undesirable for tasks such as information retrieval. Several DL models pre-trained on MIR tasks are examined against a set of plausible perturbations in a real-world setup.
Based on a proposed sensitivity measure, the experimental results indicate that all the DL models were substantially more vulnerable to perturbations than a traditional feature encoder. They also suggest that the experimental framework can be used to test pre-trained DL models for robustness. In the final main chapter, the explainability of black-box ML models is discussed. In particular, the chapter focuses on the evaluation of explanations derived from model-agnostic explanation methods. With black-box ML models having become common practice, model-agnostic explanation methods have been developed to explain predictions. However, the evaluation of such explanations is still an open problem. The work introduces an evaluation framework that measures the quality of explanations in terms of fidelity and complexity: fidelity refers to the explained mechanism's coherence with the black-box model, while complexity is the length of the explanation. Throughout the thesis, we gave special attention to experimental design so that robust conclusions can be reached. Furthermore, we focused on delivering machine learning and evaluation frameworks. This is crucial, as we intend the experimental designs and results to be reusable in general ML practice. Accordingly, we also aim for our findings to be applicable beyond music, in areas such as computer vision and natural language processing. Trustworthiness in ML is not a domain-specific problem; thus, it is vital for both researchers and practitioners from diverse problem spaces to increase awareness of complex ML systems' trustworthiness. We believe the research reported in this thesis provides meaningful stepping stones towards trustworthy ML.
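As a concrete illustration of the transfer-learning setup studied in the first chapter (a frozen pre-trained model supplying off-the-shelf representations, with only a light classifier trained on the target task), here is a hedged Python sketch. `embed_with_pretrained` is a hypothetical placeholder for any frozen pre-trained MIR network, and the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def embed_with_pretrained(audio_batch):
    # Placeholder: a real pipeline would run a frozen pre-trained MIR network here.
    return audio_batch.reshape(len(audio_batch), -1)[:, :128]

audio = rng.normal(size=(200, 16, 128))        # toy stand-in for audio clips
labels = rng.integers(0, 4, size=200)          # toy target-task labels

features = embed_with_pretrained(audio)        # off-the-shelf representations
probe = LogisticRegression(max_iter=1000)      # only this light classifier is trained
print(cross_val_score(probe, features, labels, cv=5).mean())
```

Swapping in embeddings from different source tasks and comparing probe scores is one simple way to operationalize the chapter's transferability comparison.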
- Research Article
18
- 10.36676/jrps.v13.i5.1530
- Oct 30, 2022
- International Journal for Research Publication and Seminar
The integration of machine learning (ML) models with cloud computing has transformed the landscape of predictive analytics, offering scalable, efficient, and flexible solutions for organizations. Cloud platforms such as AWS, Google Cloud, and Microsoft Azure enable businesses to deploy and manage complex ML models without the need for extensive on-premise infrastructure. However, optimizing these ML models for performance and cost-efficiency in cloud environments presents unique challenges, including resource management, latency, scalability, and data security. This paper focuses on strategies to optimize machine learning models specifically for predictive analytics in cloud environments. It explores key techniques such as auto-scaling, model compression, and hyperparameter tuning, which are critical for improving the accuracy and speed of predictions while minimizing computational costs. The research also examines advanced tools such as containerization, serverless computing, and cloud-native services that further streamline the deployment and management of ML models. In the Indian context, where cloud adoption is growing rapidly, optimizing ML models is crucial for businesses across various sectors, including finance, healthcare, and e-commerce. By leveraging cloud-based ML solutions, Indian companies can enhance their predictive analytics capabilities, driving smarter decision-making and operational efficiency. This abstract presents an overview of how optimized machine learning models can unlock the full potential of predictive analytics in cloud environments, leading to better business outcomes. Through case studies and practical applications, this paper provides actionable insights into the best practices for optimizing ML models in a cloud-based setting.
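Of the optimization levers the paper names, hyperparameter tuning is the most directly scriptable. A small scikit-learn sketch with an explicit trial budget (relevant when every trial carries cloud compute cost) follows; the model, search space, and budget are illustrative assumptions, not the paper's settings.

```python
# Randomized hyperparameter search with a fixed trial budget.
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": randint(50, 300),
                         "max_depth": randint(3, 20)},
    n_iter=10,                     # fixed trial budget caps compute spend
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```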
- Research Article
4
- 10.1016/j.jclepro.2024.143166
- Jul 15, 2024
- Journal of Cleaner Production
Evaluating external generalizability of machine learning models for recycled aggregate concrete property prediction
- Conference Article
1
- 10.1109/ethics57328.2023.10155045
- May 18, 2023
Anti-discrimination law in many jurisdictions effectively bans the use of race and gender in automated decision-making. For example, under this law insurance companies should not explicitly ask about legally protected attributes, e.g., race, in order to tailor their premiums to particular customers. In legal terms, indirect discrimination occurs when a generally neutral rule or variable is used but significantly negatively affects one demographic group. An emerging example of this concern is the inclusion of proxy variables in Machine Learning (ML) models, where neutral variables are predictive of protected attributes. For example, postcodes or zip codes are representative of communities, and therefore of racial demographics and socio-economic class; i.e., a traditional example of ‘redlining’ pre-dating modern automated techniques [1]. The law struggles with proxy variables in machine learning: indirect discrimination cases are difficult to bring to court, particularly because finding substantial evidence that shows the indirect discrimination to be unlawful is difficult [2]. As more complex machine-learning models are developed for automated decision-making, e.g., random forests or state-of-the-art deep neural networks, more data points on customers are accumulated [1], from a wide variety of sources. With such rich data, ML models can produce multiple interconnected correlations (such as those found in single neurons of a neural network, or single decision trees of a random forest) that are predictive of protected attributes, akin to traditional uses of discrete proxy variables. In this poster, we introduce the concept of "emerging proxies": combinations of several variables from which an ML model can infer the protected attribute(s) of the individuals in the dataset. This concept differs from the traditional notion of proxies because, rather than a single proxy variable, a distribution of interconnected proxies would have to be addressed. Our contribution is to provide evidence for the capacity of complex ML models to identify protected attributes through the correlation of other variables; the correlation is made not through a discrete one-to-one relationship between variables, but through a many-to-one relationship. This complements concerns raised in legal analyses of automated decision-making about proxies in ML models leading to indirect discrimination [3]. Our contribution shows that if an ML model contains "emerging proxies" for a protected attribute, the distribution of proxies will be a roadblock when attempting to de-bias the model, limiting the pathways available for addressing potential discrimination caused by the ML model.
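A minimal sketch of the proxy-capacity test this implies: check whether nominally neutral features jointly predict the protected attribute. The data and correlation structure below are synthetic assumptions for illustration only.

```python
# Can "neutral" features jointly reconstruct a protected attribute?
# High AUC signals an emerging (distributed, many-to-one) proxy.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
protected = rng.integers(0, 2, size=2000)                 # hidden attribute
# Each neutral feature is only weakly correlated with the attribute...
neutral = rng.normal(size=(2000, 10)) + 0.3 * protected[:, None]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
auc = cross_val_score(clf, neutral, protected, cv=5, scoring="roc_auc").mean()
# ...but together they recover it well: no single feature is "the" proxy.
print(f"protected-attribute AUC from neutral features: {auc:.2f}")
```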
- Conference Article
1
- 10.24963/kr.2024/45
- Nov 1, 2024
The uses of machine learning (ML) have snowballed in recent years. In many cases, ML models are highly complex, and their operation is beyond the understanding of human decision-makers. Nevertheless, some uses of ML models involve high-stakes and safety-critical applications. Explainable artificial intelligence (XAI) aims to help human decision-makers understand the operation of such complex ML models, thus eliciting trust in their operation. Unfortunately, the majority of past XAI work is based on informal approaches that offer no guarantees of rigor. Unsurprisingly, there exists comprehensive experimental and theoretical evidence confirming that informal methods of XAI can provide human decision-makers with erroneous information. Logic-based XAI represents a rigorous approach to explainability; it is model-based and offers the strongest guarantees of rigor for computed explanations. However, a well-known drawback of logic-based XAI is the complexity of logic reasoning, especially for highly complex ML models. Recent work proposed distance-restricted explanations, i.e., explanations that are rigorous provided the distance to a given input is small enough. Distance-restricted explainability is tightly related to adversarial robustness, and it has been shown to scale to moderately complex ML models, but the number of inputs still represents a key limiting factor. This paper investigates novel algorithms for scaling up the performance of logic-based explainers when computing and enumerating ML model explanations with a large number of inputs.
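To make the notion concrete, here is a toy brute-force check of a distance-restricted abductive explanation over four binary features. Actual logic-based explainers delegate this check to SAT/SMT oracles rather than enumeration, so this is illustration only; the model, instance, and distance bound are all hypothetical.

```python
from itertools import combinations, product

def f(x):                           # toy "model": majority vote over 4 bits
    return int(sum(x) >= 2)

v = (1, 1, 0, 0)                    # instance being explained
r = 2                               # Hamming distance restriction

def valid(S):
    # S explains f at v if no input agreeing with v on S, within distance r,
    # flips the prediction.
    for x in product([0, 1], repeat=4):
        agrees = all(x[i] == v[i] for i in S)
        close = sum(a != b for a, b in zip(x, v)) <= r
        if agrees and close and f(x) != f(v):
            return False
    return True

def smallest_explanation():
    for k in range(5):              # try feature subsets in increasing size
        for S in combinations(range(4), k):
            if valid(S):
                return S

print("distance-restricted explanation:", smallest_explanation())
```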
- Research Article
2
- 10.1111/cgf.15004
- Feb 27, 2024
- Computer Graphics Forum
As the complexity of machine learning (ML) models increases and their application in different (and critical) domains grows, there is a strong demand for more interpretable and trustworthy ML. A direct, model-agnostic way to interpret such models is to train surrogate models, such as rule sets and decision trees, that sufficiently approximate the original ones while being simpler and easier to explain. Yet, rule sets can become very lengthy, with many if-else statements, and decision tree depth grows rapidly when accurately emulating complex ML models. In such cases, both approaches can fail to meet their core goal of providing users with model interpretability. To tackle this, we propose DeforestVis, a visual analytics tool that offers summarization of the behaviour of complex ML models by providing surrogate decision stumps (one-level decision trees) generated with the Adaptive Boosting (AdaBoost) technique. DeforestVis helps users to explore the complexity versus fidelity trade-off by incrementally generating more stumps, creating attribute-based explanations with weighted stumps to justify decision making, and analysing the impact of rule overriding on training instance allocation between one or more stumps. An independent test set allows users to monitor the effectiveness of manual rule changes and form hypotheses based on case-by-case analyses. We show the applicability and usefulness of DeforestVis with two use cases and expert interviews with data analysts and model developers.
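The surrogate idea is easy to sketch outside the visual analytics tool: fit AdaBoost with one-level trees to the complex model's predictions and measure fidelity. The snippet below assumes scikit-learn 1.2+ (where the base learner argument is named `estimator`) and uses a random forest as a stand-in "complex" model; it does not reproduce DeforestVis itself.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
complex_model = RandomForestClassifier(random_state=0).fit(X, y)

# The surrogate learns the complex model's outputs, not the raw labels.
stumps = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # one-level decision stumps
    n_estimators=20,                                # more stumps = more fidelity
    random_state=0,
).fit(X, complex_model.predict(X))

fidelity = (stumps.predict(X) == complex_model.predict(X)).mean()
print(f"surrogate fidelity: {fidelity:.2f}")
```

Varying `n_estimators` traces out the complexity-versus-fidelity trade-off the paper lets users explore interactively.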
- Research Article
22
- 10.1038/s41598-023-28421-6
- Feb 11, 2023
- Scientific Reports
To evaluate the performance of machine learning (ML) models and compare them with the logistic regression (LR) technique in predicting cognitive impairment related to post-intensive care syndrome (PICS-CI), we conducted a prospective observational study of ICU patients at two tertiary hospitals. A cohort of 2079 patients was screened, and 481 patients were ultimately included. Seven different ML models were considered, including decision tree (DT), random forest (RF), XGBoost, neural network (NN), naïve Bayes (NB), and support vector machine (SVM), and compared with logistic regression (LR). Discriminative ability was evaluated by the area under the receiver operating characteristic curve (AUC), and calibration was assessed using calibration belt plots and the Hosmer–Lemeshow test. Decision curve analysis was performed to quantify clinical utility. Duration of delirium, poor Richards–Campbell Sleep Questionnaire (RCSQ) score, advanced age, and sepsis were the most frequent and important candidate risk factors for PICS-CI. All ML models showed good performance (AUC range: 0.822–0.906). The NN model had the highest AUC (0.906 [95% CI 0.857–0.955]), which was slightly higher than, but not significantly different from, that of LR (0.898 [95% CI 0.847–0.949]) (P > 0.05, DeLong test). Given the overfitting and complexity of some ML models, the LR model was then used to develop a web-based risk calculator to aid decision-making (https://model871010.shinyapps.io/dynnomapp/). For low-dimensional data, LR may yield performance as good as that of more complex ML models in predicting cognitive impairment after ICU hospitalization.
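A sketch of the model-comparison step, with one caveat: the DeLong test used in the paper is not available in standard Python libraries, so the snippet below substitutes a simple bootstrap confidence interval for the AUC difference, on synthetic data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1500, n_features=15, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
p_lr = LogisticRegression(max_iter=1000).fit(Xtr, ytr).predict_proba(Xte)[:, 1]
p_rf = RandomForestClassifier(random_state=0).fit(Xtr, ytr).predict_proba(Xte)[:, 1]

rng = np.random.default_rng(0)
diffs = []
for _ in range(1000):
    idx = rng.integers(0, len(yte), len(yte))       # resample the test set
    if len(np.unique(yte[idx])) < 2:
        continue                                    # skip single-class resamples
    diffs.append(roc_auc_score(yte[idx], p_rf[idx]) -
                 roc_auc_score(yte[idx], p_lr[idx]))
lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"AUC difference 95% CI: [{lo:.3f}, {hi:.3f}]")  # CI covering 0 => no clear winner
```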
- Research Article
19
- 10.1016/j.mtla.2022.101632
- Nov 18, 2022
- Materialia
FCC vs. BCC phase selection in high-entropy alloys via simplified and interpretable reduction of machine learning models
- Research Article
5
- 10.1609/aaai.v36i4.20372
- Jun 28, 2022
- Proceedings of the AAAI Conference on Artificial Intelligence
Sub-seasonal forecasting (SSF) is the prediction of key climate variables such as temperature and precipitation on the 2-week to 2-month time horizon. Skillful SSF would have substantial societal value in areas such as agricultural productivity, hydrology and water resource management, and emergency planning for extreme events such as droughts and wildfires. Despite its societal importance, SSF has remained a challenging problem compared to both short-term weather forecasting and long-term seasonal forecasting. Recent studies have shown the potential of machine learning (ML) models to advance SSF. In this paper, for the first time, we perform a fine-grained comparison of a suite of modern ML models with state-of-the-art physics-based dynamical models from the Subseasonal Experiment (SubX) project for SSF in the western contiguous United States. Additionally, we explore mechanisms to enhance the ML models by using forecasts from dynamical models. Empirical results illustrate that, on average, ML models outperform dynamical models, although the ML models tend to generate forecasts of more conservative magnitude than the SubX models. Further, we illustrate that ML models make forecasting errors under extreme weather conditions, e.g., cold waves due to the polar vortex, highlighting the need for separate models for extreme events. Finally, we show that suitably incorporating dynamical model forecasts as inputs to ML models can substantially improve the forecasting performance of the ML models. The SSF dataset constructed for the work and code for the ML models are released along with the paper for the benefit of the artificial intelligence community.
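The paper's final point (using dynamical forecasts as ML inputs) reduces to feature augmentation, sketched below on synthetic stand-ins for the climate variables and SubX-style forecasts; none of this is the paper's actual data or model.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
climate_features = rng.normal(size=(n, 8))           # e.g., lagged temperature/precip
target = climate_features[:, 0] + rng.normal(scale=0.5, size=n)
dynamical_forecast = target + rng.normal(scale=0.7, size=n)  # noisy SubX-like forecast

base = cross_val_score(Ridge(), climate_features, target, cv=5).mean()
combined = np.column_stack([climate_features, dynamical_forecast])
hybrid = cross_val_score(Ridge(), combined, target, cv=5).mean()
print(f"R^2 without forecast: {base:.2f}, with forecast: {hybrid:.2f}")
```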
- Research Article
- 10.65136/jati.v5i1.202
- Jan 20, 2026
- Journal of Applied Technology and Innovation
Determining the orbital paths of space objects is a critical task in astronomy. In particular, knowledge of satellite trajectories is essential to avoid costly and hazardous collisions between satellites in space. However, due to the amount and complexity of variables affecting a satellite’s orbit, it is no small feat to accurately predict its position. Moreover, it was only recently that novel alternatives to physics-based models were proposed, namely machine learning (ML) models that can learn from historical data and improve orbit prediction accuracy. Motivated by the hope that ML models can capture the underlying pattern of satellite orbital trajectories, the goal of this paper is to apply a supervised ML technique, non-linear regression, to predict the position and velocity of a single satellite in orbit around the Earth. The study establishes a simple non-linear regression baseline for predicting satellite motion three days in advance, from which more complex ML models can be applied. The obtained forecasts were within acceptable error margins, and the overall results show promise for applying ML to predict satellite motion.
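A hedged sketch of such a baseline: polynomial regression of one position coordinate on time. The sinusoidal "orbit", the time units, and the polynomial degree are toy assumptions, not the paper's data or model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

t_train = np.linspace(0, 10, 200)[:, None]       # historical time stamps
pos_train = np.sin(0.5 * t_train).ravel()        # one toy position coordinate

# Polynomial features turn linear regression into a non-linear regressor.
model = make_pipeline(PolynomialFeatures(degree=5), LinearRegression())
model.fit(t_train, pos_train)

t_future = np.linspace(10, 13, 30)[:, None]      # "three days ahead" analogue
print(model.predict(t_future)[:5])               # forecast positions
```

Extrapolating a polynomial beyond its training window degrades quickly, which is one reason such a model serves only as a baseline for more capable ML predictors.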
- Research Article
1
- 10.1080/02331888.2024.2401078
- Sep 2, 2024
- Statistics
We explore the potential of machine learning (ML) models applied in two financial risk management areas, i.e., credit risk management and financial risk hedging, through two practical use cases. This comparative study starts with the issue of explainability in complex ML models used in peer-to-peer lending for credit risk management. The first use case examines the limitations of using Kernel-SHAP with dependent features and evaluates different methods for estimating these dependencies using the Lending Club dataset. Our results suggest that accounting for feature dependence improves the understanding and robustness of prediction explanations. The second use case investigates a dynamic method for hedging foreign exchange risk in international equity portfolios, emphasizing the importance of accurate forecasts of currency returns. The analysis demonstrates that predictions yielded by ML models can significantly enhance the hedging of portfolios against currency risk. These findings highlight the transformative potential of advanced ML models in financial risk management, showcasing their capability to improve financial risk measurement and management. Further, our study outlines future research directions to advance this field.
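The independence assumption the first use case critiques is visible directly in a standard Kernel-SHAP call, sketched here on synthetic dependent features. This requires the third-party `shap` package and does not reproduce the paper's dependence-aware estimators or the Lending Club data.

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=500)   # strongly dependent features
X = np.column_stack([x1, x2])
y = x1 + x2

model = RandomForestRegressor(random_state=0).fit(X, y)
# KernelExplainer perturbs features independently of one another: the very
# assumption the paper shows can distort explanations under dependence.
explainer = shap.KernelExplainer(model.predict, shap.sample(X, 50))
print(explainer.shap_values(X[:5]))
```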
- Research Article
116
- 10.3390/mi13060851
- May 29, 2022
- Micromachines
Recently, the Internet of Things (IoT) has gained a lot of attention, as IoT devices are deployed in many fields. Many of these devices rely on machine learning (ML) models, which render them intelligent and able to make decisions. IoT devices typically have limited resources, which restricts the execution of complex ML models, such as deep learning (DL) models, on them. In addition, connecting IoT devices to the cloud to transfer raw data for processing delays system responses, exposes private data, and increases communication costs. To tackle these issues, a new technology called Tiny Machine Learning (TinyML) has paved the way to meeting the challenges of IoT devices. TinyML allows data to be processed locally on the device, without the need to send it to the cloud, and permits the inference of ML models, including DL models, on microcontrollers with limited resources. The aim of this paper is to provide an overview of the TinyML revolution and a review of TinyML studies. The main contribution is an analysis of the types of ML models used in TinyML studies; the paper also presents details of the datasets and the types and characteristics of the devices, with the aim of clarifying the state of the art and envisioning development requirements.
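One representative TinyML workflow step is post-training quantization, which shrinks a model so it can fit on a microcontroller. A minimal TensorFlow Lite sketch follows; the model and its size are toy, and on-device deployment details are out of scope here.

```python
import tensorflow as tf

# Tiny toy model standing in for an IoT classifier.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])

# Post-training quantization: the usual first step to fit a model on an MCU.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()
print(f"quantized flatbuffer: {len(tflite_bytes)} bytes")  # flash/RAM budget is the binding constraint
```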
- Research Article
2
- 10.1080/02331888.2024.2419420
- Oct 26, 2024
- Statistics
Modern data-driven Artificial Intelligence (AI), powered by advanced Machine Learning (ML) models, is transforming financial technologies by enhancing financial inclusion and transparency and reducing transaction costs. However, the opaque nature of some complex ML models requires new statistical approaches to manage risks and ensure trustworthiness. In this paper, we present a novel method to evaluate the key principles of trustworthy AI: Sustainability (Robustness), Accuracy, Fairness, and Explainability (SAFE). While Babaei et al. [A Rank Graduation Box for SAFE AI, Expert Syst Appl. 259 (2025) 125239] introduced the Rank Graduation Box as a streamlined approach for assessing the principles of trustworthy AI, we extend this work by employing the Wasserstein distance. Our method offers a more nuanced and geometrically oriented comparison of ML models, particularly in contexts where shifts in economic or environmental conditions alter the prediction distributions. We apply this method to compare popular ML models, including Support Vector Machines, Ensemble Trees, K-Nearest Neighbours, and Linear and Logistic Regression. The proposal is validated using both simulated data and real-world data in the context of financial risk assessment. Our findings demonstrate that the Wasserstein distance offers nuanced and interpretable insights into model behaviour across the SAFE dimensions, making it a valuable tool for model selection and regulatory compliance in AI applications.
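The core ingredient is straightforward to reproduce: the one-dimensional Wasserstein distance between two prediction distributions, e.g., before and after a shift in conditions. The Beta-distributed scores below are synthetic placeholders, not the paper's data or its full SAFE evaluation.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
scores_baseline = rng.beta(2, 5, size=1000)   # a model's prediction scores, baseline conditions
scores_shifted = rng.beta(2.5, 4, size=1000)  # same model after a (synthetic) condition shift

# Larger distance = larger change in the prediction distribution (a robustness signal).
print(wasserstein_distance(scores_baseline, scores_shifted))
```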
- Research Article
52
- 10.1109/access.2021.3112397
- Jan 1, 2021
- IEEE Access
Cyber-Physical Systems (CPSs) play a critical role in our modern infrastructure due to their capability to connect computing resources with physical systems. As such, topics such as reliability, performance, and security of CPSs continue to receive increased attention from the research community. CPSs produce massive amounts of data, creating opportunities to use predictive Machine Learning (ML) models for performance monitoring and optimization, preventive maintenance, and threat detection. However, the “black-box” nature of complex ML models is a drawback when used in safety-critical systems such as CPSs. While explainable ML has been an active research area in recent years, much of the work has been focused on supervised learning. As CPSs rapidly produce massive amounts of unlabeled data, relying on supervised learning alone is not sufficient for data-driven decision making in CPSs. Therefore, if we are to maximize the use of ML in CPSs, it is necessary to have explainable unsupervised ML models. In this paper, we outline how unsupervised explainable ML could be used within CPSs. We review the existing work in unsupervised ML, present initial desiderata of explainable unsupervised ML for CPSs, and present a Self-Organizing Maps based explainable clustering methodology which generates global and local explanations. We evaluate the fidelity of the generated explanations using feature perturbation techniques. The results show that the proposed method identifies the most important features responsible for the decision-making process of Self-Organizing Maps. Further, we demonstrate that explainable Self-Organizing Maps are a strong candidate for explainable unsupervised machine learning by comparing their model capabilities and limitations with current explainable unsupervised methods.
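A sketch of the perturbation-based fidelity idea applied to a SOM: perturb one feature and count how many winning-unit assignments change; features that move more assignments matter more to the map's decisions. This uses the third-party `minisom` package and synthetic data, and is not the paper's full methodology.

```python
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))

som = MiniSom(6, 6, input_len=4, sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(X, 1000)

def assignment_change_rate(X, feature, scale=1.0):
    Xp = X.copy()
    Xp[:, feature] += rng.normal(scale=scale, size=len(X))  # perturb one feature
    # Fraction of samples whose winning SOM unit moved.
    return np.mean([som.winner(a) != som.winner(b) for a, b in zip(X, Xp)])

for f in range(4):
    print(f"feature {f}: change rate {assignment_change_rate(X, f):.2f}")
```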
- Research Article
1
- 10.1007/s10994-025-06852-8
- Aug 19, 2025
- Machine Learning
Model-agnostic feature attribution techniques are used to explain the decisions of complex machine learning (ML) models, including ensemble models and deep neural networks (DNNs). However, since complex ML models perform best when trained on low-level features, the explanations generated by these algorithms are often not interpretable or usable by humans. Recently proposed model-agnostic methods that support the generation of human-interpretable explanations are impractical because they require a fully invertible transformation function that maps the model’s input features to human-interpretable features. While some practical human-interpretable explainability methods exist (e.g., concept-based methods), they typically require direct access to the model and are not fully model-agnostic. In this paper, we introduce Latent SHAP, a model-agnostic black-box feature attribution framework that provides human-interpretable explanations without necessitating a fully invertible transformation function. We validate the fidelity of Latent SHAP's explanations through quantitative faithfulness assessments on two controlled datasets: a self-generated artificial dataset and the dSprites dataset. Furthermore, we showcase the practical utility of Latent SHAP in various real-world scenarios across domains such as computer vision, natural language processing, and cybersecurity. Each domain involves complex models (ensembles, DNNs, and LLMs), where invertible transformation functions are not available.
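Latent SHAP's algorithm is not reproduced here. As a generic stand-in for attributing in an interpretable feature space without an invertible map, the sketch below regresses the black-box outputs onto interpretable features and explains that surrogate with Kernel SHAP; all names and data are hypothetical, and it requires the `shap` package.

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
low_level = rng.normal(size=(400, 50))           # e.g., pixels/tokens the black box consumes
interpretable = low_level[:, :3] ** 2            # hypothetical derived concepts (not invertible)
blackbox_preds = low_level[:, :3].sum(axis=1)    # stand-in for black-box model outputs

# Surrogate from concepts to predictions; attributions land in concept space.
surrogate = GradientBoostingRegressor().fit(interpretable, blackbox_preds)
explainer = shap.KernelExplainer(surrogate.predict, shap.sample(interpretable, 50))
print(explainer.shap_values(interpretable[:3]))  # attributions over interpretable concepts
```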