Bias and Fairness in Large Language Models: A Survey

Isabel O Gallegos,Sungchul Kim,Ruiyi Zhang,Ryan A Rossi,Tong Yu,Joe Barrow,Md Mehrab Tanjim,Nesreen K Ahmed,Franck Dernoncourt

doi:10.1162/coli_a_00524

Abstract

Abstract Rapid advancements of large language models (LLMs) have enabled the processing, understanding, and generation of human-like text, with increasing integration into systems that touch our social sphere. Despite this success, these models can learn, perpetuate, and amplify harmful social biases. In this article, we present a comprehensive survey of bias evaluation and mitigation techniques for LLMs. We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing, defining distinct facets of harm and introducing several desiderata to operationalize fairness for LLMs. We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation, namely, metrics and datasets, and one for mitigation. Our first taxonomy of metrics for bias evaluation disambiguates the relationship between metrics and evaluation datasets, and organizes metrics by the different levels at which they operate in a model: embeddings, probabilities, and generated text. Our second taxonomy of datasets for bias evaluation categorizes datasets by their structure as counterfactual inputs or prompts, and identifies the targeted harms and social groups; we also release a consolidation of publicly available datasets for improved access. Our third taxonomy of techniques for bias mitigation classifies methods by their intervention during pre-processing, in-training, intra-processing, and post-processing, with granular subcategories that elucidate research trends. Finally, we identify open problems and challenges for future work. Synthesizing a wide range of recent research, we aim to provide a clear guide of the existing literature that empowers researchers and practitioners to better understand and prevent the propagation of bias in LLMs.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Journal: Computational Linguistics	Publication Date: Aug 14, 2024
Citations: 13	License type: CC BY-NC-ND 4.0

R Discovery Prime

R Discovery Prime

Bias and Fairness in Large Language Models: A Survey

Abstract

Talk to us

Similar Papers

More From: Computational Linguistics

Lead the way for us

Similar Papers

#2924 Comparison of large language models and traditional natural language processing techniques in predicting arteriovenous fistula failure
Suman Lama ... Luca Neri
Nephrology Dialysis Transplantation | VOL. 39
Suman Lama, et. al.Suman Lama ... Luca Neri
23 May 2024
Nephrology Dialysis Transplantation | VOL. 39

Exploring Social Biases of Large Language Models in a College Artificial Intelligence Course
Skylar Kolisko ... Carolyn Jane Anderson
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 37
Skylar Kolisko, et. al.Skylar Kolisko ... Carolyn Jane Anderson
26 Jun 2023
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 37

Large language models for biomedicine: foundations, opportunities, challenges, and best practices.
Satya S Sahoo ... Yanshan Wang
Journal of the American Medical Informatics Association : JAMIA | VOL. 31
Satya S Sahoo, et. al.Satya S Sahoo ... Yanshan Wang
24 Apr 2024
Journal of the American Medical Informatics Association : JAMIA | VOL. 31

Use of SNOMED CT in Large Language Models: Scoping Review.
Eunsuk Chang ... Sumi Sung
JMIR medical informatics | VOL. 12
Eunsuk Chang, et. al.Eunsuk Chang ... Sumi Sung
07 Oct 2024
JMIR medical informatics | VOL. 12

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

Bias and Fairness in Large Language Models: A Survey

Abstract

Talk to us

Similar Papers

More From: Computational Linguistics