Artificial intelligence (AI) has gained massive interest with the public release of the conversational AI "ChatGPT," but it has also become a matter of concern for academia because it can easily be misused. We performed a quantitative evaluation of the performance of ChatGPT on a medical physiology university examination. Forty-one answers were obtained with ChatGPT and compared to the results of 24 students. The results of ChatGPT were significantly better than those of the students; the median (IQR) score was 75% (66-84%) for the AI compared to 56% (43-65%) for students (P < 0.001). The exam success rate was 100% for ChatGPT, whereas 29% (n = 7) of students failed. ChatGPT could promote plagiarism and intellectual laziness among students and could represent a new and easy way to cheat, especially when evaluations are performed online. Considering that these powerful AI tools are now freely available, scholars should take great care to construct assessments that really evaluate student reflection skills and prevent AI-assisted cheating.

NEW & NOTEWORTHY The release of the conversational artificial intelligence (AI) ChatGPT has become a matter of concern for academia because it can easily be misused by students for cheating purposes. We performed a quantitative evaluation of the performance of ChatGPT on a medical physiology university examination and observed that ChatGPT outperformed medical students, obtaining significantly better grades. Scholars should therefore take great care to construct assessments that really evaluate student reflection skills and prevent AI-assisted cheating.