Below average ChatGPT performance in medical microbiology exam compared to university students

Malik Sallam, Khaled Al-Salahat

doi:10.3389/feduc.2023.1333415

Abstract

Background: The transformative potential of artificial intelligence (AI) in higher education is evident, with conversational models like ChatGPT poised to reshape teaching and assessment methods. The rapid evolution of AI models requires continuous evaluation. AI-based models can offer personalized learning experiences but raise accuracy concerns. Multiple-choice questions (MCQs) are widely used for competency assessment. The aim of this study was to evaluate ChatGPT performance in medical microbiology MCQs compared to the students’ performance.

Methods: The study employed an 80-MCQ dataset from a 2021 medical microbiology exam in the Doctor of Dental Surgery (DDS) Medical Microbiology 2 course at the University of Jordan. The exam contained 40 midterm and 40 final MCQs, authored by a single instructor without copyright issues. The MCQs were categorized based on the revised Bloom’s Taxonomy into four categories: Remember, Understand, Analyze, or Evaluate. Metrics, including facility index and discriminative efficiency, were derived from the performance of 153 DDS students on the midterm exam and 154 on the final exam. ChatGPT 3.5 was used to answer the questions, and its responses were assessed for correctness and clarity by two independent raters.

Results: ChatGPT 3.5 correctly answered 64 out of 80 medical microbiology MCQs (80%) but scored below the student average (80.5/100 vs. 86.21/100). Incorrect ChatGPT responses were more common in MCQs with longer choices (p = 0.025). ChatGPT 3.5 performance varied across cognitive domains: Remember (88.5% correct), Understand (82.4% correct), Analyze (75% correct), Evaluate (72% correct), with no statistically significant differences (p = 0.492). Correct ChatGPT responses received statistically significantly higher average clarity and correctness scores than incorrect responses.

Conclusion: The study findings emphasized the need for ongoing refinement and evaluation of ChatGPT performance. ChatGPT 3.5 showed the potential to correctly and clearly answer medical microbiology MCQs; nevertheless, its performance was below par compared to the students. Variability in ChatGPT performance across cognitive domains should be considered in future studies. The study insights could contribute to the ongoing evaluation of the role of AI-based models in educational assessment and help augment traditional methods in higher education.
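The facility index named in the Methods is conventionally the percentage of examinees who answer an item correctly. The following is a minimal sketch of that arithmetic, assuming dichotomously scored items (1 = correct, 0 = incorrect). The function names and the five-student response list are hypothetical illustrations, not the study's code or data.

```python
from typing import Sequence


def facility_index(item_scores: Sequence[int]) -> float:
    """Percentage of examinees who answered one MCQ correctly.

    Assumes dichotomous scoring: 1 = correct, 0 = incorrect.
    """
    if not item_scores:
        raise ValueError("item_scores must not be empty")
    return 100.0 * sum(item_scores) / len(item_scores)


def percent_correct(n_correct: int, n_items: int) -> float:
    """Share of items answered correctly, as a percentage."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return 100.0 * n_correct / n_items


# Hypothetical responses of five students to a single item
# (illustrative only, NOT data from the study).
hypothetical_item = [1, 1, 0, 1, 1]
print(facility_index(hypothetical_item))  # 80.0

# The only figures taken from the abstract: 64 correct answers out of 80 MCQs.
print(percent_correct(64, 80))  # 80.0
```

A higher facility index indicates an easier item. The second call only reproduces the 80% correct-answer rate reported in the Results.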

Journal: Frontiers in Education	Publication Date: Dec 21, 2023
Citations: 12	License type: CC BY 4.0

Similar Papers

ChatGTP: What is it and how can nursing and health science education use it?
Mandy M Archibald ... Alexander M Clark
Journal of Advanced Nursing | VOL. 79
21 Mar 2023

Exploring the potential of artificial intelligence tools in educational measurement and assessment
Valentine Joseph Owan ... Bassey Asuquo Bassey
Eurasia Journal of Mathematics, Science and Technology Education | VOL. 19
01 Aug 2023

Accuracy and reliability of large language models in assessing learning outcomes achievement across cognitive domains.
Swapna Haresh Teckwani ... Ivan Cherh Chiet Low
Advances in physiology education | VOL. 48
01 Dec 2024

Digitalization, clinical microbiology and infectious diseases
A Egli
Clinical Microbiology and Infection | VOL. 26
02 Jul 2020
