Selective Differential Privacy for Language Modeling

Weiyan Shi

doi:10.48448/xadd-z853

Abstract

With the increasing applications of language models, it has become crucial to protect these models from leaking private information. Previous work has attempted to tackle this challenge by training RNN-based language models with differential privacy guarantees. However, applying classical differential privacy to language models leads to poor model performance, as the underlying privacy notion is overly pessimistic and provides undifferentiated protection for all tokens in the data. Given that the private information in natural language is sparse (for example, the bulk of an email might not carry personally identifiable information), we propose a new privacy notion, selective differential privacy, to provide rigorous privacy guarantees on the sensitive portion of the data to improve model utility. To realize such a new notion, we develop a corresponding privacy mechanism, Selective-DPSGD, for RNN-based language models. Besides language modeling, we also apply the method to a more concrete application: dialog systems. Experiments on both language modeling and dialog system building show that the proposed privacy-preserving mechanism achieves better utility while remaining safe under various privacy attacks compared to the baselines. The data and code are released at https://github.com/wyshi/lm_privacy.
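To make the idea of protecting only the sensitive portion of the data concrete, the following is a minimal Python sketch. It is not the authors' Selective-DPSGD, whose details and privacy accounting are not given in this abstract. Everything in it is an illustrative assumption: the regex-based `policy` function, the names `clip` and `selective_step`, and the choice to apply a standard clip-then-add-Gaussian-noise gradient update only when a token is flagged as sensitive.

```python
import math
import random
import re

# Hypothetical policy: flag digit runs and email-like tokens as sensitive.
SENSITIVE = re.compile(r"^(\d+|[^@\s]+@[^@\s]+\.\w+)$")


def policy(tokens):
    """Return one boolean per token: True if the token is flagged sensitive."""
    return [bool(SENSITIVE.match(t)) for t in tokens]


def clip(grad, max_norm):
    """Scale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def selective_step(params, grad, is_sensitive,
                   lr=0.1, max_norm=1.0, sigma=1.0, rng=random):
    """One illustrative update (assumed design, not the paper's algorithm).

    Non-sensitive input: plain SGD step.
    Sensitive input: gradient is clipped, then Gaussian noise is added.
    """
    if is_sensitive:
        grad = clip(grad, max_norm)
        grad = [g + rng.gauss(0.0, sigma * max_norm) for g in grad]
    return [p - lr * g for p, g in zip(params, grad)]


if __name__ == "__main__":
    tokens = "call me at 5551234 or mail bob@example.com".split()
    print(list(zip(tokens, policy(tokens))))
    print(selective_step([1.0, 1.0], [3.0, 4.0], is_sensitive=True))
```

Because most tokens in the example sentence are not flagged, most updates would skip clipping and noise, which is the intuition behind the utility gain the abstract describes. A real mechanism would also need batching, per-example gradients, and a privacy accountant, none of which this sketch attempts.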
