Abstract

Artificial intelligence chatbots such as ChatGPT have become powerful tools that are disrupting how humans interact with technology, and their potential uses within medicine are vast. In medical education, these chatbots have shown rapid improvement on generalized medical examinations. We evaluated the overall performance of ChatGPT 3.5 and ChatGPT 4.0, and the improvement between them, on a test of pediatric cardiology knowledge. Both versions were used to answer text-based multiple-choice questions derived from a Pediatric Cardiology Board Review textbook. Each chatbot was given the same 88-question test, subcategorized into 11 topics; questions that relied on modalities other than text (sound clips or images) were excluded. Statistical analysis used an unpaired two-tailed t-test. ChatGPT 4.0 answered 66% of the questions correctly (58/88), significantly more (p < 0.0001) than ChatGPT 3.5, which answered 38% (33/88). ChatGPT 4.0 also outperformed ChatGPT 3.5 on each subspecialty topic. Although ChatGPT does not yet offer subspecialty-level knowledge in pediatric cardiology, its performance on pediatric cardiology educational assessments improved considerably in the short period between versions 3.5 and 4.0.
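
As an illustration of the comparison described above, the following minimal sketch computes an unpaired (Student's) t statistic from the reported counts, treating each question as a 0/1 score. It is not the authors' analysis code: the function names are hypothetical, equal variances are assumed, and the two-tailed p-value is approximated with the standard normal distribution (reasonable only because the degrees of freedom are large), so the printed value will differ slightly from an exact t-distribution p-value.

```python
import math


def unpaired_t_from_counts(correct_a, correct_b, n):
    """Student's t statistic for two equal-sized groups of 0/1 scores.

    correct_a, correct_b: number of correct answers in each group.
    n: number of questions per group (assumed equal for both groups).
    """
    mean_a = correct_a / n
    mean_b = correct_b / n
    # Sample variance (n - 1 denominator) of a 0/1 variable with k ones
    # is k * (n - k) / (n * (n - 1)).
    var_a = correct_a * (n - correct_a) / (n * (n - 1))
    var_b = correct_b * (n - correct_b) / (n * (n - 1))
    # With equal group sizes the pooled variance is the simple average.
    pooled_var = (var_a + var_b) / 2
    std_err = math.sqrt(pooled_var * 2 / n)
    t_stat = (mean_a - mean_b) / std_err
    dof = 2 * n - 2
    return t_stat, dof


def two_tailed_p_normal_approx(t_stat):
    """Two-tailed p-value using the normal approximation (an assumption)."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


# Counts reported in the abstract: 58/88 (ChatGPT 4.0) vs 33/88 (ChatGPT 3.5).
t_stat, dof = unpaired_t_from_counts(58, 33, 88)
p_approx = two_tailed_p_normal_approx(t_stat)
print(f"t = {t_stat:.2f}, df = {dof}, approximate two-tailed p = {p_approx:.2g}")
```

Because the outcomes are binary, a two-proportion z-test or a chi-squared test would be a common alternative; the t-test is shown here only because it is the method the abstract names.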

