Abstract

Objective: To describe an evaluation of a generative language model tool for writing examination questions for a new elective course on the interpretation of common clinical laboratory results being developed for students in a Bachelor of Science in Pharmaceutical Sciences program.

Methods: A total of 100 multiple-choice questions were generated using a publicly available large language model for a course dealing with common laboratory values. Two independent evaluators with extensive training and experience in writing multiple-choice questions assessed each question for appropriate formatting, clarity, correctness, relevancy, and difficulty. For each question, each reviewer assigned a final dichotomous judgment: usable as written or not usable as written.

Results: The major finding of this study was that a generative language model (ChatGPT 3.5) could generate multiple-choice questions for assessing common laboratory value information, but only about half of the questions (50% and 57% for the 2 evaluators) were deemed usable without modification. General agreement between evaluator comments was common (62% of comments), with more than 1 correct answer being the most common reason for judging a question not usable (N = 27).

Conclusion: The generally positive findings of this study suggest that the use of a generative language model tool for developing examination questions merits further investigation.
