Abstract
Text generation by artificial intelligence has recently become available to a broader public. This technology is based on machine learning and language models that must be trained on input data. Many studies have focused on distinguishing human-written from generated text, but recent work shows that the underlying language models may reproduce gender bias in their output and, consequently, reinforce gender roles and imbalances. In this paper, we give a perspective on this topic, considering both the generated text data itself and the machine learning models used for language generation. We present a case study of gender bias in generated text data and review recent literature addressing language models. Our results indicate that research on gender bias in the context of text generation faces significant challenges and that future work needs to overcome a lack of definitions as well as a lack of transparency.