Abstract

Background
The increasing application of artificial intelligence (AI) in healthcare offers significant opportunities for improving clinical decision-making and medical education. This study assesses the performance of ChatGPT, a large language model (LLM), on the Italian State Exam for Medical Residency (SSM) test, to evaluate its potential as a tool for medical education and clinical decision-making support. Given the high stakes of medical decision-making, understanding the capabilities of AI tools like ChatGPT in this context is crucial.

Methods
A cross-sectional study design was employed, analyzing 136 questions from the official SSM test. ChatGPT's responses were compared to the performance of medical doctors who took the test in 2022. Questions were classified into clinical cases (CC) and notional questions (NQ). The study assessed ChatGPT's overall accuracy and its performance on CC and NQ, and compared its results with those of the participating medical doctors.

Results
ChatGPT achieved an overall accuracy of 90.44%, with higher performance on clinical cases (92.45%) than on notional questions (89.15%). ChatGPT scored higher than 99.6% of the participating medical doctors.

Conclusions
Based on the data, ChatGPT shows promise as a useful tool in clinical decision-making, particularly in clinical reasoning. This study recommends further research to explore the potential applications and implementation of LLMs in medical education and medical practice, as well as the initiation of procedures and policies for their safe and effective integration into healthcare systems.

Key messages
• ChatGPT outperforms 99.6% of medical doctors in the Italian State Exam for Medical Residency, demonstrating potential as a clinical decision-making tool.
• Further research is needed to explore LLM implementation in medical education and practice, and to develop policies for safe AI integration.
