Abstract

In this paper, we present a systematic case study of gender bias in machine translation with Google Translate. We translated sentences containing names of occupations from Hungarian, a language with gender-neutral pronouns, into English. Our aim was to provide a fair measure of bias by comparing the translations to an optimal, non-biased translator. When assessing bias, we used the following reference points: (1) the distribution of men and women across occupations in both the source- and target-language countries, and (2) the results of a Hungarian survey that examined whether certain jobs are generally perceived as feminine or masculine. We also studied how expanding the sentences with adjectives referring to the occupations affects the gender of the translated pronouns. We found bias against both genders, but biased results against women are much more frequent. Translations are closer to our perceptions of occupations than to objective occupational statistics. Finally, occupations have a greater effect on translation than adjectives.
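To make the measurement concrete, the sketch below illustrates the general approach with a hypothetical translation call and a toy set of occupations: gender-neutral Hungarian sentences built around the pronoun "ő" are translated into English, and the pronoun the system commits to is classified. The template, occupation list, and canned translations are illustrative assumptions, not the paper's actual stimuli or observed outputs.

```python
import re

# A minimal sketch, assuming a hypothetical translation function and a toy
# occupation list; the Hungarian pronoun "ő" is gender-neutral, so the English
# translation has to commit to "he" or "she".
TEMPLATE = "ő egy {occupation}."  # e.g., "ő egy ápoló." ~ "he/she is a nurse."
OCCUPATIONS = ["ápoló", "mérnök", "tanár", "orvos"]  # nurse, engineer, teacher, doctor


def translate_hu_to_en(sentence: str) -> str:
    """Placeholder for a real MT call (e.g., a Google Translate client).

    Returns canned answers here only so the sketch runs end to end; the
    outputs are invented for illustration, not observed system behaviour.
    """
    canned = {
        "ő egy ápoló.": "She is a nurse.",
        "ő egy mérnök.": "He is an engineer.",
        "ő egy tanár.": "She is a teacher.",
        "ő egy orvos.": "He is a doctor.",
    }
    return canned[sentence]


def pronoun_gender(english_sentence: str) -> str:
    """Classify the translated subject pronoun as 'male', 'female', or 'neutral'."""
    tokens = re.findall(r"[a-z]+", english_sentence.lower())
    if "he" in tokens:
        return "male"
    if "she" in tokens:
        return "female"
    return "neutral"  # e.g., "they" or a rephrased translation


for occupation in OCCUPATIONS:
    source = TEMPLATE.format(occupation=occupation)
    target = translate_hu_to_en(source)
    print(f"{source!r} -> {target!r} ({pronoun_gender(target)})")
```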

Highlights

  • In recent years, there has been growing interest in research on machine bias, also referred to as algorithmic bias

  • This paper presents a systematic study of job-related gender bias in machine translation

  • We present results on Google Translate’s bias, defined in relation to the male-to-female ratio of occupations according to Hungarian census data. 35% of the occupations were translated with an inadequate pronoun, and 77% of these were translated with “he” instead of “she.” This suggests bias against both genders, but biased results against women are much more common (see the sketch below)
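The sketch below is a hedged illustration of how such a figure could be computed: compare each occupation's translated pronoun with the majority gender reported for it in census data, then take the share of mismatches. The census shares and pronoun assignments are invented placeholders carried over from the sketch above, so the printed percentages will not match the paper's 35% and 77%.

```python
# A minimal sketch of the headline metric, reusing the pronoun classifications
# from the sketch above: a translation counts as "inadequate" when the chosen
# pronoun contradicts the majority gender of that occupation in census data.
# The female shares below are invented placeholders, NOT the Hungarian census
# figures used in the paper.
census_female_share = {
    "ápoló": 0.90,   # nurse
    "mérnök": 0.15,  # engineer
    "tanár": 0.80,   # teacher
    "orvos": 0.55,   # doctor
}
translated_pronoun = {  # pronoun gender chosen in the hypothetical MT output above
    "ápoló": "female",
    "mérnök": "male",
    "tanár": "female",
    "orvos": "male",
}

inadequate = []
for occupation, female_share in census_female_share.items():
    majority = "female" if female_share > 0.5 else "male"
    if translated_pronoun[occupation] != majority:
        inadequate.append((occupation, translated_pronoun[occupation]))

inadequate_rate = len(inadequate) / len(census_female_share)
he_share = sum(1 for _, g in inadequate if g == "male") / max(len(inadequate), 1)
print(f"translated with an inadequate pronoun: {inadequate_rate:.0%}")  # paper: 35%
print(f"of which rendered as 'he': {he_share:.0%}")                     # paper: 77%
```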


Summary

Introduction

There has been growing interest in research on machine bias, also referred to as algorithmic bias. The term “machine bias” describes the phenomenon that machine learning algorithms are prone to reinforce or amplify human biases (Prates et al., 2020). Over the past few years, researchers and journalists have uncovered many cases in which algorithms produced biased results against certain social groups, making socially unjust decisions in terms of race, gender, age, or religion. A few examples are gender bias in hiring algorithms (Chen et al., 2018; Dastin, 2018; Schwarm, 2018), ageist and racist ad targeting (Angwin et al., 2017; Barocas and Selbst, 2016; Chen et al., 2018), and the omission of regional dialects from the training corpora of Natural Language Processing (NLP) algorithms (Jurgens et al., 2017).

