Abstract

With the advancement of sentiment analysis (SA) models and their incorporation into our daily lives, fairness testing of these models is crucial, since unfair decisions can cause discrimination against a large population. Nevertheless, fairness testing faces several challenges: the unknown test oracle, the difficulty of generating suitable test inputs, and the lack of a reliable way to fix the issues once they are found. To address these gaps, BiasRV, a tool based on metamorphic testing (MT), was introduced and succeeded in uncovering fairness issues in a transformer-based SA model. However, the extent of unfairness in other SA models has not been thoroughly investigated. We conduct a more comprehensive empirical study to reveal the extent of fairness violations, specifically gender fairness violations, exhibited by other popular word embedding-based SA models. We define a fairness violation as the behavior in which an SA model assigns different sentiments to variants of a text that differ only in gender. Our inspection with BiasRV uncovers at least 30 fairness violations (at BiasRV’s default threshold) in all three SA models. Recognizing the importance of addressing such significant violations, we introduce adversarial patches (AP) as a patch-generation approach within an automated program repair (APR) system to fix them. AP adopts adversarial fine-tuning: SA models are retrained on adversarial examples, i.e., bias-uncovering test cases generated dynamically at runtime by a tool named BiasFinder. Evaluation on the SA models shows that our proposed AP reduces fairness violations by at least 25%.
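To make the fairness-violation definition above concrete, the snippet below sketches a metamorphic gender-fairness check. It is a minimal illustration under assumptions, not BiasRV's or BiasFinder's actual implementation: `predict_sentiment` is a hypothetical stand-in for an SA model's prediction function, and `mutate_gender` is a simplified stand-in for a BiasFinder-style gender-mutant generator.

```python
# Minimal sketch of a metamorphic gender-fairness check (illustrative only).
# Assumptions: `predict_sentiment` wraps an SA model and returns a label such
# as "pos"/"neg"; `mutate_gender` produces a variant differing only in gender
# terms (case and possessive handling omitted for brevity).

GENDER_SWAPS = {"he": "she", "she": "he", "him": "her", "her": "him",
                "man": "woman", "woman": "man"}

def mutate_gender(text: str) -> str:
    """Return a variant of `text` that differs only in gender terms."""
    return " ".join(GENDER_SWAPS.get(w.lower(), w) for w in text.split())

def is_fairness_violation(predict_sentiment, text: str) -> bool:
    """A violation occurs when a text and its gender-swapped variant
    receive different sentiment predictions."""
    variant = mutate_gender(text)
    return predict_sentiment(text) != predict_sentiment(variant)

# Usage with a toy predictor (hypothetical, for demonstration only).
if __name__ == "__main__":
    toy_model = lambda t: "neg" if "she" in t.lower().split() else "pos"
    print(is_fairness_violation(toy_model, "He loved the movie"))  # True -> biased
```

In the paper's setting, BiasRV applies this kind of check at runtime and reports a violation when the disagreement between original and gender-swapped predictions exceeds its threshold.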
