Abstract
This study seeks to determine the potential use and reliability of a large language model for answering questions in a sub-specialized area of medicine, specifically practice examination questions in otolaryngology-head and neck surgery, and to assess its current efficacy for surgical trainees and learners. All available questions from a public, paid-access question bank were manually entered into ChatGPT, and its outputs were compared against the benchmark of the answers and explanations provided by the question bank. Each question was assessed in two domains: answer accuracy and comprehensiveness of the explanation. Overall, our study demonstrates a ChatGPT correct answer rate of 53% and a correct explanation rate of 54%. We find that answer and explanation accuracy decline as question difficulty increases. Currently, artificial intelligence-driven learning platforms are not robust enough to serve as reliable medical education resources for assisting learners in sub-specialty-specific patient decision-making scenarios.