Abstract

This study evaluated the effectiveness of ChatGPT 4 in generating multiple-choice questions (MCQs) with explanations for pathology board examinations, specifically in digestive system pathology. A customized ChatGPT 4 model was developed to generate MCQs and explanations. Expert pathologists evaluated content accuracy and relevance. The MCQs were then administered to pathology residents, followed by an analysis of question difficulty, accuracy, item discrimination, and internal consistency. The customized ChatGPT 4 generated 80 MCQs covering various gastrointestinal and hepatobiliary topics. While the MCQs demonstrated moderate to high agreement on evaluation parameters such as content accuracy, clinical relevance, and overall quality, there were issues with cognitive level and distractor quality. The explanations were generally acceptable. Nine residents with a median of 1 year of experience took the test; the average score was 57.4 (71.8%). Pairwise comparisons revealed a significant difference in performance between each year group (P < .01). The test analysis showed moderate difficulty, effective item discrimination (discrimination index = 0.15), and good internal consistency (Cronbach's α = 0.74). ChatGPT 4 demonstrated significant potential as a supplementary educational tool in medical education, especially in generating MCQs with explanations similar to those seen in board examinations. While the artificial intelligence-generated content was of high quality, it required refinement and expert review.
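
For reference, the psychometric measures cited above follow standard textbook definitions; the formulas below are the conventional ones, not reproduced from the study itself.

\[ D = p_U - p_L \]
\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_X^2}\right) \]

Here \(D\) is the item discrimination index, with \(p_U\) and \(p_L\) the proportions of correct responses in the upper- and lower-scoring groups; \(\alpha\) is Cronbach's alpha, with \(k\) the number of items, \(\sigma_i^2\) the variance of item \(i\), and \(\sigma_X^2\) the variance of total test scores.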
