Performance of artificial intelligence chatbots in sleep medicine certification board exams: ChatGPT versus Google Bard.

Ryan Chin Taw Cheong, Andrew Williamson, Kenny Peter Pang, Samit Unadkat, Venkata Mcneillis, Vinidh Paleri, Peter Andrews, Premjit Randhawa, Jonathan Joseph

doi:10.1007/s00405-023-08381-3

Abstract

This study compared the performance of GPT-3.5, GPT-4 and Google Bard on self-assessment questions at the level of the American Sleep Medicine Certification Board Exam. A total of 301 text-based, single-best-answer multiple-choice questions with four answer options each, across 10 categories, were included in the study and transcribed as inputs for GPT-3.5, GPT-4 and Google Bard. The first response generated for each question was recorded and scored against the gold-standard answer provided by the American Academy of Sleep Medicine. Human sleep medicine specialists must score 80% or above to pass each exam category. GPT-4 achieved the pass mark of 80% or above in five of the 10 exam categories: the Normal Sleep and Variants Self-Assessment Exam (2021), Circadian Rhythm Sleep-Wake Disorders Self-Assessment Exam (2021), Insomnia Self-Assessment Exam (2022), Parasomnias Self-Assessment Exam (2022) and the Sleep-Related Movements Self-Assessment Exam (2023). GPT-4 outperformed both other models in all exam categories and achieved a higher overall score (68.1%) than GPT-3.5 (46.8%) and Google Bard (45.5%), a statistically significant difference (p < 0.001). There was no significant difference in overall score between GPT-3.5 and Google Bard. Otolaryngologists and sleep medicine physicians have a crucial role, through agile and robust research, in ensuring that next-generation AI chatbots are built safely and responsibly.
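The overall scores above can be checked with a little arithmetic. The sketch below is illustrative only: the abstract does not say which statistical test the authors used, so the unpaired two-proportion z-test here is an assumption, and the correct-answer counts are inferred by rounding the reported percentages against the 301 questions. The helper names `implied_correct` and `two_proportion_z_test` are hypothetical and do not come from the paper.

```python
import math

TOTAL_QUESTIONS = 301  # stated in the abstract
PASS_MARK = 80.0       # percent, stated in the abstract

# Overall scores reported in the abstract (percent correct).
reported_scores = {"GPT-4": 68.1, "GPT-3.5": 46.8, "Google Bard": 45.5}


def implied_correct(score_percent, total=TOTAL_QUESTIONS):
    """Nearest whole number of correct answers implied by a percentage."""
    return round(score_percent / 100 * total)


def two_proportion_z_test(correct_a, correct_b, n=TOTAL_QUESTIONS):
    """Two-sided pooled z-test for two proportions with equal sample size n.

    Assumption: treats the two models' answers as independent samples.
    The study's actual test is not given in the abstract.
    """
    p_a, p_b = correct_a / n, correct_b / n
    pooled = (correct_a + correct_b) / (2 * n)
    std_err = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Inferred counts: GPT-4 = 205, GPT-3.5 = 141, Google Bard = 137.
counts = {name: implied_correct(s) for name, s in reported_scores.items()}

# No model reaches the 80% pass mark on the overall score.
passes_overall = {name: s >= PASS_MARK for name, s in reported_scores.items()}

_, p_gpt4_vs_gpt35 = two_proportion_z_test(counts["GPT-4"], counts["GPT-3.5"])
_, p_gpt4_vs_bard = two_proportion_z_test(counts["GPT-4"], counts["Google Bard"])
_, p_gpt35_vs_bard = two_proportion_z_test(counts["GPT-3.5"], counts["Google Bard"])

print(counts)
print(p_gpt4_vs_gpt35, p_gpt4_vs_bard, p_gpt35_vs_bard)
```

Under these assumptions both GPT-4 comparisons fall below 0.001 and the GPT-3.5 versus Google Bard comparison is far from significant, which is consistent with the reported findings. Because all three models answered the same questions, a paired test such as McNemar's would be the more natural choice, but it needs per-question results that the abstract does not provide.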

Journal: European Archives of Oto-Rhino-Laryngology
Publication Date: Dec 20, 2023
Citations: 17
