Abstract
The main objective of this study is to evaluate how accurately the Large Language Model ChatGPT answers USMLE board-style medical ethics questions compared with medical knowledge questions. The study has two additional objectives: to compare the overall accuracy of GPT-3.5 with that of GPT-4, and to assess the variability of the responses given by each version. Using AMBOSS, a third-party USMLE Step exam test-prep service, we selected one group of 27 medical ethics questions and a second group of 27 medical knowledge questions matched on question difficulty for medical students. We ran 30 trials of these questions on GPT-3.5 and GPT-4 and recorded the output. A random-effects linear probability regression model evaluated accuracy, and a Shannon entropy calculation evaluated response variation. Both versions of ChatGPT performed worse on medical ethics questions than on medical knowledge questions. GPT-4 performed 18 percentage points (P < 0.05) worse on medical ethics questions than on medical knowledge questions, and GPT-3.5 performed 7 percentage points (P = 0.41) worse, a difference that was not statistically significant. GPT-4 outperformed GPT-3.5 by 22 percentage points (P < 0.001) on medical ethics and by 33 percentage points (P < 0.001) on medical knowledge. GPT-4 also exhibited lower Shannon entropy on medical ethics and medical knowledge questions (0.21 and 0.11, respectively) than GPT-3.5 (0.59 and 0.55), which indicates lower variability in its responses. In summary, both versions of ChatGPT performed more poorly on medical ethics questions than on medical knowledge questions, and GPT-4 significantly outperformed GPT-3.5 on overall accuracy while exhibiting significantly lower variability in its answer choices. This underscores the need for ongoing assessment of ChatGPT versions for medical education.

Keywords: ChatGPT, Large Language Model, Artificial Intelligence, Medical Education, USMLE, Ethics.
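
The Shannon entropy measure of response variation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `shannon_entropy`, the use of a base-2 logarithm, and the example answer lists are assumptions made for illustration, since the abstract does not state the logarithm base or how per-question entropies were aggregated.

```python
from collections import Counter
from math import log2


def shannon_entropy(responses):
    """Shannon entropy (in bits, an assumed unit) of the answer choices
    given across repeated trials of a single question."""
    counts = Counter(responses)
    total = len(responses)
    # H = sum over choices of p * log2(1 / p), where p = count / total
    return sum((c / total) * log2(total / c) for c in counts.values())


# Hypothetical data: 30 trials of one multiple-choice question.
always_same = ["A"] * 30                # the model never changes its answer
split_evenly = ["A"] * 15 + ["B"] * 15  # the model alternates between two answers

print(shannon_entropy(always_same))   # no variation in the responses
print(shannon_entropy(split_evenly))  # maximal variation between two choices
```

Under these assumptions, an entropy of 0 means the model gave the same answer on every trial, and higher values mean its answers were spread across more choices. This is the sense in which a lower entropy indicates lower response variability.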