This study investigates how effectively a proposed version of Meta’s LLaMA 3 model detects fake claims in bilingual (English and Romanian) datasets, using a multi-class approach that goes beyond traditional binary classification to better reflect real-world scenarios. The proposed model is optimized for identifying nuanced categories such as “Mostly True” and “Mostly False”, and its performance is compared against leading large language models (LLMs), including OpenAI’s ChatGPT versions, Google’s Gemini, and similar LLaMA models. The analysis shows that the proposed LLaMA 3 model consistently outperforms its base version and older LLaMA models, particularly on the Romanian dataset, achieving the highest accuracy (39%) among all the compared LLMs and showing a superior ability to identify nuanced claims. However, results across both languages also expose challenges: accuracy is generally low, and all the LLMs struggle with ambiguous categories. The study also underscores the impact of language and cultural context on model reliability, noting that even state-of-the-art models such as ChatGPT 4o and Gemini behave inconsistently when applied to Romanian text and to classification schemes that go beyond a binary true/false distinction.