KoBBQ: Korean Bias Benchmark for Question Answering

Jiho Jin, Nayeon Lee, Jiseon Kim, Hwaran Lee, Alice Oh, Haneul Yoo

doi:10.1162/tacl_a_00661

Abstract

Warning: This paper contains examples of stereotypes and biases.

The Bias Benchmark for Question Answering (BBQ) is designed to evaluate social biases of language models (LMs), but adapting it to cultural contexts other than the US is not straightforward because social biases depend heavily on cultural context. In this paper, we present KoBBQ, a Korean bias benchmark dataset, and propose a general framework that addresses considerations for the cultural adaptation of a dataset. Our framework partitions the BBQ dataset into three classes: Simply-Transferred (can be used directly after cultural translation), Target-Modified (requires localization in target groups), and Sample-Removed (does not fit Korean culture). It also adds four new categories of bias specific to Korean culture. We conduct a large-scale survey to collect and validate the social biases and the targets of the biases that reflect the stereotypes in Korean culture. The resulting KoBBQ dataset comprises 268 templates and 76,048 samples across 12 categories of social bias. We use KoBBQ to measure the accuracy and bias scores of several state-of-the-art multilingual LMs. The results clearly show that the bias of LMs measured by KoBBQ differs from the bias measured by a machine-translated version of BBQ, demonstrating the need for and utility of a well-constructed, culturally aware social bias benchmark.

Journal: Transactions of the Association for Computational Linguistics
Publication Date: May 3, 2024
License type: CC BY 4.0

Similar Papers

Efficient handling of multilingual language models
C Fugen ... S Stuker
30 Nov 2003

Exploring Social Biases of Large Language Models in a College Artificial Intelligence Course
Skylar Kolisko ... Carolyn Jane Anderson
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 37
26 Jun 2023

Improving Cross-lingual Information Retrieval on Low-Resource Languages via Optimal Transport Distillation
Zhiqi Huang ... James Allan
27 Feb 2023

CDAIL-BIAS MEASURER: A Model Ensemble Approach for Dialogue Social Bias Measurement
Jishun Zhao ... Pengyuan Liu
01 Jan 2021
