Abstract

In the evolving field of machine learning, deploying fair and transparent models remains a formidable challenge. Building on earlier research showing that neural architectures exhibit inherent biases, this study analyzes a broad spectrum of transformer-based language models, from base to x-large configurations. Using the Word Embedding Association Test (WEAT), it examines movie reviews for genre-based bias and finds that scaling models up tends to mitigate bias, with larger models showing up to a 29% reduction in measured bias. The study also underscores the effectiveness of prompt-based learning, a facet of prompt engineering, as a practical approach to bias mitigation: this technique reduces genre bias in reviews by more than 37% on average. These results suggest that development practices should include the strategic use of prompts to shape model outputs, underscoring the role of ethical AI integration in weaving fairness into the core functionality of transformer models. Even though the prompts employed in this research are basic, the findings point to structured prompt engineering as a path toward AI systems that are more ethical, equitable, and accountable.
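For context, the WEAT statistic referenced above is conventionally computed as a standardized difference of cosine-similarity associations between two target word sets and two attribute word sets. The sketch below is a minimal, generic illustration in Python with NumPy; the choice of target sets (e.g., embeddings of genre terms) and attribute sets (e.g., embeddings of positive and negative sentiment words), and how the embeddings are extracted from the transformer models, are assumptions for illustration and not details taken from this abstract.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    """s(w, A, B): mean cosine similarity of word vector w with attribute
    set A minus its mean cosine similarity with attribute set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """WEAT effect size: difference between the mean associations of the two
    target sets X and Y, normalized by the standard deviation of the
    associations over all target words (a Cohen's-d-style measure)."""
    assoc_X = [association(x, A, B) for x in X]
    assoc_Y = [association(y, A, B) for y in Y]
    pooled_std = np.std(assoc_X + assoc_Y, ddof=1)
    return (np.mean(assoc_X) - np.mean(assoc_Y)) / pooled_std

# Hypothetical usage: X, Y hold embeddings for two genre term lists,
# A, B hold embeddings for positive and negative sentiment words.
# effect = weat_effect_size(X, Y, A, B)
```

Under this reading, an effect size closer to zero would correspond to the reduced bias reported for larger models and for prompt-based mitigation, though the abstract does not specify the exact word sets or scoring protocol used.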
