Abstract

ChatGPT is a leading Large Language Model trained on an extensive and diverse assortment of text data. However, training on potentially biased internet corpora can introduce fundamental bias into the model, which is subsequently reflected in its generated output. This paper quantifies the bias present in GPT-3.0 model responses on a range of controversial topics using carefully engineered prompts. We measured the raw bias of each generated response using the Bipartisan Press API, and tested our hypothesis by applying statistical methods, namely the t-test and ANOVA, to these raw bias measurements. Our results demonstrate a statistically significant left-leaning bias in 9 of the 11 controversial topics we tested. Further, the ANOVA analysis shows that the degree of bias varies by topic. We posit that our findings could be instrumental in guiding future efforts to mitigate training bias and address the broader alignment problem in generative AI.
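As a minimal sketch of the statistical tests named above (not the authors' code), the snippet below applies a one-sample t-test per topic and a one-way ANOVA across topics to raw bias scores. The topic names, score values, and sign convention (negative = left-leaning) are all hypothetical assumptions for illustration; only the SciPy calls themselves are standard.

```python
# Illustrative sketch only. Bias scores are hypothetical stand-ins for
# per-response scores returned by the Bipartisan Press API; the sign
# convention (negative = left-leaning) is an assumption.
from scipy import stats

bias_scores = {
    "immigration": [-12.4, -8.1, -15.0, -9.7, -11.2],
    "gun_control": [-5.3, -7.8, -2.1, -6.4, -4.9],
    "taxation":    [-1.2, 0.8, -2.5, -0.6, -1.9],
}

# One-sample t-test per topic: is the mean bias significantly
# different from zero (i.e., from a neutral response)?
for topic, scores in bias_scores.items():
    t_stat, p_value = stats.ttest_1samp(scores, popmean=0.0)
    print(f"{topic}: t={t_stat:.2f}, p={p_value:.4f}")

# One-way ANOVA: does the mean bias differ across topics?
f_stat, p_value = stats.f_oneway(*bias_scores.values())
print(f"ANOVA: F={f_stat:.2f}, p={p_value:.4f}")
```

Under this setup, the per-topic t-tests address whether each topic's responses are biased at all, while the ANOVA addresses the paper's second claim, that the magnitude of bias differs between topics.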
