Evaluating the Influence of Role-Playing Prompts on ChatGPT's Misinformation Detection Accuracy: Quantitative Study.

Michael Robert Haupt, Luning Yang, Tina Purnat, Tim Mackey

doi:10.2196/60678

Open Access

https://doi.org/10.2196/60678

Journal: JMIR Infodemiology
Publication Date: Sep 26, 2024
License type: CC BY

Abstract

During the COVID-19 pandemic, the rapid spread of misinformation on social media created significant public health challenges. Large language models (LLMs), pretrained on extensive textual data, have shown potential in detecting misinformation, but their performance can be influenced by factors such as prompt engineering (ie, modifying LLM requests to assess changes in output). One form of prompt engineering is role-playing, where, upon request, OpenAI's ChatGPT imitates specific social roles or identities. This research examines how ChatGPT's accuracy in detecting COVID-19-related misinformation is affected when it is assigned social identities in the request prompt. Understanding how LLMs respond to different identity cues can inform messaging campaigns, ensuring effective use in public health communications.

This study investigates the impact of role-playing prompts on ChatGPT's accuracy in detecting misinformation. It also assesses differences in performance when misinformation is explicitly stated versus implied, based on contextual knowledge, and examines the reasoning given by ChatGPT for classification decisions.

Overall, 36 real-world tweets about COVID-19 collected in September 2021 were categorized into misinformation, sentiment (opinions aligned vs unaligned with public health guidelines), corrections, and neutral reporting. ChatGPT was tested with prompts incorporating different combinations of multiple social identities (ie, political beliefs, education levels, locality, religiosity, and personality traits), resulting in 51,840 runs. Two control conditions were used to compare results: prompts with no identities and those including only political identity.

The findings reveal that including social identities in prompts reduces average detection accuracy, with a notable drop from 68.1% (SD 41.2%; no identities) to 29.3% (SD 31.6%; all identities included). Prompts with only political identity resulted in the lowest accuracy (19.2%, SD 29.2%). ChatGPT was also able to distinguish sentiments expressing opinions not aligned with public health guidelines from misinformation making declarative statements. There were no consistent differences in performance between explicit misinformation and implicit misinformation requiring contextual knowledge. While the findings show that the inclusion of identities decreased detection accuracy, it remains uncertain whether ChatGPT adopts views aligned with social identities: when assigned a conservative identity, ChatGPT identified misinformation with nearly the same accuracy as it did when assigned a liberal identity. While political identity was mentioned most frequently in ChatGPT's explanations for its classification decisions, the rationales for classifications were inconsistent across study conditions, and contradictory explanations were provided in some instances.

These results indicate that ChatGPT's ability to classify misinformation is negatively impacted when role-playing social identities, highlighting the complexity of integrating human biases and perspectives in LLMs. This points to the need for human oversight in the use of LLMs for misinformation detection. Further research is needed to understand how LLMs weigh social identities in prompt-based tasks and explore their application in different cultural contexts.
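
The identity-combination design described above can be illustrated with a short Python sketch. It is not the authors' code: the abstract names the five identity dimensions but not their levels, the prompt wording, or the model settings, so the levels, the `build_prompts` and `control_prompt` helpers, and the prompt text are all hypothetical. With two assumed levels per dimension the sketch yields 32 prompts per tweet and does not reproduce the study's 51,840 runs.

```python
from itertools import product

# Hypothetical identity levels. The abstract lists these five dimensions
# but not their values, so every level here is an illustrative assumption.
IDENTITY_LEVELS = {
    "political beliefs": ["liberal", "conservative"],
    "education level": ["high school diploma", "college degree"],
    "locality": ["urban", "rural"],
    "religiosity": ["religious", "not religious"],
    "personality trait": ["agreeable", "skeptical"],
}

# Hypothetical question wording; the study's actual prompt is not given.
QUESTION = "Does the following tweet contain misinformation? Tweet: "


def control_prompt(tweet):
    """Control condition: the classification question with no identity."""
    return QUESTION + tweet


def build_prompts(tweet, levels=IDENTITY_LEVELS):
    """Return one role-playing prompt per combination of identity levels."""
    dimensions = list(levels)
    prompts = []
    # product() enumerates every combination, one level per dimension.
    for combo in product(*(levels[d] for d in dimensions)):
        persona = "; ".join(f"{d}: {v}" for d, v in zip(dimensions, combo))
        prompts.append(
            f"Respond as a person with these traits ({persona}). "
            + control_prompt(tweet)
        )
    return prompts


prompts = build_prompts("Example tweet text.")
print(len(prompts))  # 2 levels in each of 5 dimensions gives 2**5 prompts
```

Each generated prompt would then be sent to the model and its label compared with the human-assigned category; that step is omitted here because the abstract does not describe the API calls or scoring procedure.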

Similar Papers

Large Language Models Can Enable Inductive Thematic Analysis of a Social Media Corpus in a Single Prompt: Human Validation Study.
Michael S Deiner ... Urmimala Sarkar
JMIR infodemiology | VOL. 4
29 Aug 2024

Explaining Misinformation Detection Using Large Language Models
Vishnu S. Pendyala ... Christopher E. Hall
Electronics | VOL. 13
26 Apr 2024

Personality testing of large language models: limited temporal stability, but highlighted prosociality.
Bojana Bodroža ... Ljubiša Bojić
Royal Society open science | VOL. 11
01 Oct 2024

Abstract IA04: Evaluating and mitigating medical misinformation risk in large language models
Shan Chen ... Danielle D Bitterman
Clinical Cancer Research | VOL. 31
26 Jan 2025
