Abstract

Theory of mind (ToM), understanding others’ mental states, is a defining skill belonging to humans. Research assessing LLMs’ ToM performance yields conflicting findings and leads to discussions about whether and how they could show ToM understanding. Psychological research indicates that the characteristics of a specific language can influence how mental states are represented and communicated. Thus, it is reasonable to expect language characteristics to influence how LLMs communicate with humans, especially when the conversation involves references to mental states. This study examines how these characteristics affect LLMs’ ToM performance by evaluating GPT 3.5 and 4 performances in English and Turkish. Turkish provides an excellent contrast to English since Turkish has a different syntactic structure and special verbs, san- and zannet-, meaning “falsely believe.” Using Open AI's Chat Completion API, we collected responses from GPT models for first- and second-order ToM scenarios in English and Turkish. Our innovative approach combined completion prompts and open-ended questions within the same chat session, offering deep insights into models’ reasoning processes. Our data showed that while GPT models can respond accurately to standard ToM tasks (100% accuracy), their performance deteriorates (below chance level) with slight modifications. This high sensitivity suggests a lack of robustness in ToM performance. GPT 4 outperformed its predecessor, GPT 3.5, showing improvement in ToM performance to some extent. The models generally performed better when tasks were presented in English than in Turkish. These findings indicate that GPT models cannot reliably pass first-order and second-order ToM tasks in either of the languages yet. The findings have significant implications for Explainability of LLMs by highlighting challenges and biases that they face when simulating human-like ToM understanding in different languages.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.