Auto-Debias: Debiasing Masked Language Models with Automated Biased Prompts

Yue Guo, Ahmed Abbasi, Yi Yang

doi:10.18653/v1/2022.acl-long.72

Abstract

Human-like biases and undesired social stereotypes exist in large pretrained language models. Given the wide adoption of these models in real-world applications, mitigating such biases has become an emerging and important task. In this paper, we propose an automatic method to mitigate the biases in pretrained language models. Unlike previous debiasing work that uses external corpora to fine-tune the pretrained models, we instead directly probe the biases encoded in pretrained models through prompts. Specifically, we propose a variant of the beam search method to automatically search for biased prompts, that is, prompts whose cloze-style completions differ the most across demographic groups. Given the identified biased prompts, we then propose a distribution alignment loss to mitigate the biases. Experimental results on standard datasets and metrics show that our proposed Auto-Debias approach can significantly reduce biases, including gender and racial bias, in pretrained language models such as BERT, RoBERTa and ALBERT. Moreover, the improvement in fairness does not decrease the language models’ understanding abilities, as shown using the GLUE benchmark.
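
The two steps named in the abstract (searching for biased prompts, then aligning completion distributions) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the abstract does not name the divergence, so Jensen-Shannon divergence is used as one plausible choice; `TOY_SKEW`, `toy_completion_dist`, and `bias_score` are hypothetical stand-ins for querying a real masked language model, and the beam search shown is the plain textbook version rather than the authors' variant.

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence: symmetric, zero when p == q, at most log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical toy "model": each prompt token carries a skew that pushes the
# completion distribution in opposite directions for the two groups.
TOY_SKEW = {"works": 1.0, "as": 0.5, "is": 0.1}

def toy_completion_dist(prompt, group):
    """Toy distribution over two completion words for a prompt and a group."""
    s = sum(TOY_SKEW[tok] for tok in prompt)
    if group == "she":
        s = -s
    a = 1.0 / (1.0 + math.exp(-s))
    return [a, 1.0 - a]

def bias_score(prompt):
    """How differently the toy model completes the prompt for the two groups."""
    return jsd(toy_completion_dist(prompt, "he"),
               toy_completion_dist(prompt, "she"))

def beam_search(vocab, score, beam_width=2, max_len=2):
    """Grow prompts token by token, keeping the beam_width highest-scoring ones."""
    beam = [()]
    for _ in range(max_len):
        candidates = [prompt + (tok,) for prompt in beam for tok in vocab]
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beam

biased_prompts = beam_search(list(TOY_SKEW), bias_score)
# Under the JSD assumption, an alignment loss would be the same divergence,
# averaged over the prompts found and minimized during fine-tuning.
alignment_loss = sum(bias_score(p) for p in biased_prompts) / len(biased_prompts)
```

In this sketch the search objective and the debiasing loss are the same quantity: the search maximizes it over prompts, and fine-tuning would minimize it over model parameters.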

Publication Date: Jan 1, 2022
Citations: 25
License type: cc-by
