Advancing Medical Education: Performance of Generative Artificial Intelligence Models on Otolaryngology Board Preparation Questions With Image Analysis Insights.

Emma Terwilliger, George Bcharah, Clare Richardson, Estefana Bcharah, Patrick Scheffler, Hend Bcharah

doi:10.7759/cureus.64204

Abstract

Objective: To evaluate and compare the performance of Chat Generative Pre-Trained Transformer (ChatGPT), GPT-4, and Google Bard on United States otolaryngology board-style questions in order to gauge their ability to act as an adjunctive study tool and resource for students and doctors.

Methods: A total of 1,077 text-based questions and 60 image-based questions from the otolaryngology board exam preparation tool BoardVitals were entered into ChatGPT, GPT-4, and Google Bard. Each response was scored as true or false, depending on whether the artificial intelligence (AI) model provided the correct answer. Data analysis was performed in RStudio.

Results: GPT-4 scored the highest at 78.7%, compared with 55.3% for ChatGPT and 61.7% for Bard (p<0.001). In terms of question difficulty, all three AI models performed best on easy questions (ChatGPT: 69.7%, GPT-4: 92.5%, and Bard: 76.4%) and worst on hard questions (ChatGPT: 42.3%, GPT-4: 61.3%, and Bard: 45.6%). GPT-4 outperformed Bard and ChatGPT at every difficulty level (p<0.0001). GPT-4 also outperformed ChatGPT and Bard in all subspecialty sections, with significantly higher scores (p<0.05) on all sections except allergy (p>0.05). On image-based questions, GPT-4 scored higher than Bard (56.7% vs 46.4%), although the difference was not statistically significant (p=0.368), and it showed better overall image interpretation capabilities.

Conclusion: This study showed that the GPT-4 model performed better than both ChatGPT and Bard on United States otolaryngology board practice questions. Although the GPT-4 results were promising, AI should still be used with caution when implemented in medical education or patient care settings.
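The abstract compares accuracy rates between models and reports p-values, but it does not state which statistical test was used or give the raw counts behind each percentage. As a rough illustration only, the Python sketch below runs a two-sided two-proportion z-test on correct-answer counts. The function name and the counts in the example are hypothetical and are not taken from the study. Because all models answered the same questions, a paired test such as McNemar's may be the better fit in practice; the unpaired test here is a simplifying assumption.

```python
import math


def two_proportion_z_test(correct_a, total_a, correct_b, total_b):
    """Two-sided z-test for the difference between two independent proportions.

    Returns (z, p_value). Hypothetical helper for illustration; the study
    itself ran its analysis in R and does not name the test it used.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("totals must be positive")

    p_a = correct_a / total_a
    p_b = correct_b / total_b

    # Pooled proportion under the null hypothesis of equal accuracy.
    pooled = (correct_a + correct_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        # Both models got everything right (or everything wrong).
        return 0.0, 1.0

    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: model A answers 80 of 100 correctly, model B 60 of 100.
    z, p = two_proportion_z_test(80, 100, 60, 100)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

A p-value below the chosen threshold (0.05 in the abstract) would suggest that the two accuracy rates differ by more than chance alone would explain.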

Journal: Cureus	Publication Date: Jul 9, 2024
Citations: 1	License type: cc-by
