Abstract

Objective: Multiple Choice Questions (MCQs) are a valuable assessment tool, but writing them to match learning objectives requires subject experts. AI tools such as ChatGPT may offer an alternative. This study compares the quality of MCQs developed by faculty with those generated by ChatGPT for a postgraduate medical program. Material & Methods: Specific learning objectives were extracted from one module each of a medical and a surgical postgraduate program. For each learning objective, one mid-level faculty member and the AI software each developed an MCQ with a clinical scenario. Two subject and medical education experts from each specialty, blinded to the question source, used a standardized online tool to rate the technical and content quality of the MCQs in five domains: the item, vignette, question stem, response options, and overall quality. Results: For the medicine and allied specialty, 23 MCQs in each set were assessed. There was no significant difference in any variable, in the overall quality of the MCQs, or in the odds of the decision to accept the questions. For the surgical and allied specialty, two sets of 24 MCQs were assessed. There was no difference in the "Item" and "Vignette" domains. In the "question stem" domain, MCQs developed by faculty were more grammatically correct (p = 0.02). There was no difference in overall quality or in the odds of the decision to accept. Conclusions: AI's impact on education is undeniable. Our findings indicate that faculty outperformed ChatGPT in specific areas, though overall question quality was comparable. More research is necessary, but ChatGPT could potentially streamline assessment development, saving faculty substantial time.
