Theory of mind performance of large language models: A comparative analysis of Turkish and English

Burcu Ünlütabak,Onur Bal

doi:10.1016/j.csl.2024.101698

Abstract

Theory of mind (ToM), understanding others’ mental states, is a defining skill belonging to humans. Research assessing LLMs’ ToM performance yields conflicting findings and leads to discussions about whether and how they could show ToM understanding. Psychological research indicates that the characteristics of a specific language can influence how mental states are represented and communicated. Thus, it is reasonable to expect language characteristics to influence how LLMs communicate with humans, especially when the conversation involves references to mental states. This study examines how these characteristics affect LLMs’ ToM performance by evaluating GPT 3.5 and 4 performances in English and Turkish. Turkish provides an excellent contrast to English since Turkish has a different syntactic structure and special verbs, san- and zannet-, meaning “falsely believe.” Using Open AI's Chat Completion API, we collected responses from GPT models for first- and second-order ToM scenarios in English and Turkish. Our innovative approach combined completion prompts and open-ended questions within the same chat session, offering deep insights into models’ reasoning processes. Our data showed that while GPT models can respond accurately to standard ToM tasks (100% accuracy), their performance deteriorates (below chance level) with slight modifications. This high sensitivity suggests a lack of robustness in ToM performance. GPT 4 outperformed its predecessor, GPT 3.5, showing improvement in ToM performance to some extent. The models generally performed better when tasks were presented in English than in Turkish. These findings indicate that GPT models cannot reliably pass first-order and second-order ToM tasks in either of the languages yet. The findings have significant implications for Explainability of LLMs by highlighting challenges and biases that they face when simulating human-like ToM understanding in different languages.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

R Discovery Prime

R Discovery Prime

Theory of mind performance of large language models: A comparative analysis of Turkish and English

Abstract

Talk to us

Similar Papers

More From: Computer Speech & Language

Lead the way for us

Similar Papers

Theory of mind after traumatic brain injury
Helen Bibby ... Skye Mcdonald
Neuropsychologia | VOL. 43
Helen Bibby, et. al.Helen Bibby ... Skye Mcdonald
20 Jul 2004
Neuropsychologia | VOL. 43

Theory of mind and cognitive processes in aging and Alzheimer type dementia: a systematic review
Mélanie Sandoz ... Marion Fossard
Aging & Mental Health | VOL. 18
Mélanie Sandoz, et. al.Mélanie Sandoz ... Marion Fossard
04 Apr 2014
Aging & Mental Health | VOL. 18

Plasticity in older adults’ theory of mind performance: the impact of motivation
Xin Zhang ... Tianyong Chen
Aging & Mental Health | VOL. 22
Xin Zhang, et. al.Xin Zhang ... Tianyong Chen
08 Sep 2017
Aging & Mental Health | VOL. 22

Examining mind-reading in the life span: from longitudinal to training studies
Serena Lecce
Frontiers in Psychology | VOL. 8
Serena LecceSerena Lecce
01 Jan 2018
Frontiers in Psychology | VOL. 8

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

Theory of mind performance of large language models: A comparative analysis of Turkish and English

Abstract

Talk to us

Similar Papers

More From: Computer Speech & Language