Abstract

Taiwan is well known for the quality of its healthcare system, and the country's medical licensing exams offer a way to evaluate ChatGPT's medical proficiency. We analyzed exam data from February 2022, July 2022, February 2023, and July 2023. Each exam comprised four papers of 80 single-choice questions, grouped as descriptive or picture-based. We evaluated ChatGPT-4; for questions it initially answered incorrectly, we applied a "chain of thought" (CoT) prompting approach. Accuracy rates were calculated as percentages. ChatGPT-4's accuracy across the exams ranged from 63.75% to 93.75% (February 2022 to July 2023), with the highest accuracy (93.75%) on the Medicine (3) paper of February 2022. The subjects with the highest rates of incorrect answers were ophthalmology (28.95%), breast surgery (27.27%), plastic surgery (26.67%), orthopedics (25.00%), and general surgery (24.59%). With CoT prompting, accuracy on the retried questions ranged from 0.00% to 88.89%, and the final overall accuracy rate ranged from 90% to 98%. ChatGPT-4 thus passed Taiwan's medical licensing exams, and the "chain of thought" prompt raised its overall accuracy to above 90%.
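As a minimal sketch of the accuracy arithmetic behind these figures (hypothetical and illustrative only; the counts below are invented, and only the percentage formulas mirror the abstract): initial accuracy is the share of correct first answers, CoT accuracy is computed over the retried questions, and the final overall accuracy counts answers recovered by the CoT retry.

```python
# Hypothetical sketch of the accuracy arithmetic described in the abstract.
# All counts are invented for illustration; only the formulas follow the text.

def accuracy(correct: int, total: int) -> float:
    """Accuracy rate expressed as a percentage."""
    return 100.0 * correct / total

# Example: an 80-question paper where 75 answers are right on the first pass.
total = 80
initial_correct = 75
initial_acc = accuracy(initial_correct, total)            # 93.75%

# Chain-of-thought retry on the 5 initially wrong answers; suppose 4 recover.
cot_correct = 4
cot_acc = accuracy(cot_correct, total - initial_correct)  # 80.00% on retries

# Final overall accuracy counts both first-pass and CoT-recovered answers.
final_acc = accuracy(initial_correct + cot_correct, total)  # 98.75%
print(f"initial {initial_acc:.2f}%, CoT {cot_acc:.2f}%, final {final_acc:.2f}%")
```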
