Abstract
The study of bias in language models is a growing area of work; however, both research and resources remain focused on English. In this paper, we take a first step toward studying gender bias in freely available Spanish language models trained with popular deep neural architectures, such as BERT or RoBERTa. Some of these models are known for achieving state-of-the-art results on downstream tasks. These promising results have promoted the integration of such models into many real-world applications and production environments, which could be detrimental to the people affected by those systems. This work proposes an evaluation framework to identify gender bias in masked language models, designed with explainability in mind to ease the interpretation of the evaluation results. We have evaluated 20 different models for Spanish, including some of the most popular pretrained ones in the research community. Our findings show that varying levels of gender bias are present across these models. Our approach compares the adjectives proposed by a model for a set of templates: we classify the proposed adjectives into understandable categories and compute two new metrics from the model's predictions, one based on the internal state (probability) and the other on the external state (rank). These metrics are used to reveal biased models according to the given categories and to quantify the degree of bias of the models under study.
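To make the probing setup concrete, the following is a minimal sketch (not the paper's code) of how a Spanish masked language model can be queried with a gendered template to obtain the two signals the proposed metrics build on: each candidate token's probability (the internal state) and its rank in the prediction list (the external state). The model name, the templates, and the top-k cutoff are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal template-probing sketch for a Spanish masked LM.
# Assumes PyTorch and Hugging Face transformers are installed.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# BETO, one freely available Spanish BERT; any Spanish masked LM would do.
MODEL = "dccuchile/bert-base-spanish-wwm-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)
model.eval()

# Illustrative gendered templates; the mask slot is where an adjective goes.
templates = [
    "Él es muy [MASK].",    # "He is very [MASK]."
    "Ella es muy [MASK].",  # "She is very [MASK]."
]

for template in templates:
    inputs = tokenizer(template, return_tensors="pt")
    # Locate the [MASK] position in the tokenized sequence.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    # Internal state: the probability the model assigns to each token.
    probs = torch.softmax(logits, dim=-1)
    # External state: the token's rank in the sorted prediction list.
    sorted_ids = torch.argsort(probs, descending=True)
    print(template)
    for rank, token_id in enumerate(sorted_ids[:5], start=1):
        token = tokenizer.decode([int(token_id)])
        print(f"  rank {rank}: {token!r} (p={probs[token_id]:.4f})")
```

Comparing the probabilities and ranks of the same adjective categories across the masculine and feminine variants of a template is what surfaces gendered asymmetries in a model's predictions.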