We aimed to compare student performance, examiner perceptions and cost of GPT-assisted (generative pretrained transformer-assisted) clinical and professional skills assessment (CPSA) items against items created using standard methods. We conducted a prospective, controlled, double-blinded comparison of CPSA items developed with GPT assistance and those created through standard methods. Two sets of six practical cases were developed for a formative assessment sat by final-year medical students. One clinical case in each set was created with GPT assistance. Students were assigned to one of the two sets. The results of 239 participants were analysed. There was no statistically significant difference in item difficulty or discriminative ability between GPT-assisted and standard items. All respondents to an examiner feedback questionnaire (n=15) felt that the GPT-assisted cases were appropriately difficult and realistic. GPT assistance resulted in significant labour cost savings, with a mean reduction of 57% (880 GBP) in labour cost per case compared with standard case drafting methods. GPT assistance can create CPSA items of comparable quality at significantly lower cost than standard methods. Future studies could evaluate GPT's ability to create CPSA material in other areas of clinical practice to test the generalisability of these findings.