Abstract

Gender bias strongly affects natural language processing applications. Word embeddings have been shown both to retain and to amplify gender biases present in current data sources. Recently, contextualized word embeddings have enhanced previous word embedding techniques by computing word vector representations that depend on the sentence in which the word appears. In this paper, we study the impact of this conceptual change in word embedding computation in relation to gender bias. Our analysis includes different measures previously applied in the literature to standard word embeddings. Our findings suggest that contextualized word embeddings are less biased than standard ones, even when the latter are debiased.
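To make the conceptual change concrete, the sketch below extracts two vectors for the same word occurring in two different sentences: a static embedding table would assign identical vectors, whereas a contextualized model produces sentence-dependent representations. This is an illustrative example only, assuming the HuggingFace transformers library and the bert-base-uncased checkpoint (choices made here for illustration, not necessarily the models evaluated in this work).

```python
# Illustrative sketch (not the paper's code): the same surface word receives
# different vectors depending on its sentence context.
# Assumes the HuggingFace `transformers` library and `bert-base-uncased`.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Last-layer hidden state of the first occurrence of `word`
    (assumed to be a single subtoken)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (sequence_length, hidden_size)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

v1 = word_vector("the nurse finished her shift", "nurse")
v2 = word_vector("the nurse finished his shift", "nurse")
# Contextualized: the two "nurse" vectors differ with the surrounding sentence.
print(torch.cosine_similarity(v1, v2, dim=0).item())
```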

Highlights

  • Social biases in machine learning in general, and in natural language processing (NLP) applications in particular, are raising alarm in the scientific community

  • Pretrained Language Models (LMs) like ULMfit (Howard and Ruder, 2018), ELMo (Peters et al., 2018), OpenAI GPT (Radford, 2018; Radford et al., 2019) and BERT (Devlin et al., 2018) proposed different neural language model architectures and made their pre-trained weights available to ease the application of transfer learning to downstream tasks, where they have pushed the state of the art for several benchmarks including question answering on SQuAD, natural language inference (NLI), cross-lingual NLI and named entity recognition (NER)

  • While our study cannot draw clear conclusions on whether contextualized word embeddings augment or reduce gender bias, our results offer more insight into which aspects of the final contextualized word vectors are affected by this phenomenon, with a tendency towards reducing gender bias rather than the contrary

Summary

Introduction

Social biases in machine learning in general, and in natural language processing (NLP) applications in particular, are raising alarm in the scientific community. Examples of these biases are findings that face recognition and speech recognition systems work better for white men than for ethnic minorities (Buolamwini and Gebru, 2018). Word embedding representation spaces are known to present geometrical phenomena that mimic relations and analogies between words (e.g. man is to woman as king is to queen). Following this property of finding relations or analogies, one popular example of gender bias is the analogy "man is to computer programmer as woman is to homemaker" (Bolukbasi et al., 2016).
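The analogy property described above can be probed directly with vector arithmetic over a static embedding space. The sketch below is an illustration only, assuming the gensim library and the publicly available word2vec-google-news-300 vectors (example resources, not necessarily the exact embeddings analysed in the paper).

```python
# Illustrative sketch (not the paper's evaluation code): probing analogy
# structure in a static embedding space with vector arithmetic.
# Assumes gensim and the public word2vec-google-news-300 vectors.
import gensim.downloader as api

wv = api.load("word2vec-google-news-300")  # pretrained static word embeddings

# "man is to king as woman is to ?"  ->  queen is expected near the top
print(wv.most_similar(positive=["woman", "king"], negative=["man"], topn=3))

# The same arithmetic exposes gender bias (Bolukbasi et al., 2016):
# "man is to computer_programmer as woman is to ?"
print(wv.most_similar(positive=["woman", "computer_programmer"],
                      negative=["man"], topn=3))
```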

Background
Word Embeddings
Debiased Word Embeddings
Contextualized Word Embeddings
Research questions
Experimental Framework
Evaluation measures and results
Conclusions and further work