Abstract

Objectives: The development of artificial intelligence-powered language models, such as Chatbot Generative Pre-trained Transformer (ChatGPT) or Large Language Model Meta AI (Llama), is emerging in medicine. Patients and practitioners have full access to chatbots that may provide medical information. The aim of this study was to explore the performance and accuracy of ChatGPT and Llama in treatment decision-making for bilateral vocal fold paralysis (BVFP).

Methods: Data from twenty clinical cases, treated between 2018 and 2023, were retrospectively collected from four tertiary laryngology centres in Europe. The cases were defined as the most common or most challenging scenarios regarding BVFP treatment. The treatment proposals were discussed in the centres' local multidisciplinary teams (MDT). Each case was presented to ChatGPT-4.0 and Llama Chat-2.0, and potential treatment strategies were requested. The Artificial Intelligence Performance Instrument (AIPI) treatment subscore was used to compare both chatbots' performance with the MDT treatment proposal.

Results: The most common aetiology of BVFP was thyroid surgery. A form of partial arytenoidectomy, with or without posterior transverse cordotomy, was the MDT proposal for most cases. The accuracy of both chatbots' treatment proposals was very low, with a maximum AIPI treatment score in 5% of the cases. In most cases, even harmful assertions were made, including the suggestion of vocal fold medialisation to treat patients with stridor and dyspnoea. ChatGPT-4.0 performed significantly better than Llama Chat-2.0 in suggesting the correct treatment as part of the treatment proposal (50% versus 15%).

Conclusion: ChatGPT and Llama are judged as inaccurate in proposing the correct treatment for BVFP. ChatGPT significantly outperformed Llama. Treatment decision-making for a complex condition such as BVFP is clearly beyond the chatbots' expertise. This study highlights the complexity and heterogeneity of BVFP treatment, and the need for further guidelines dedicated to the management of BVFP.
