Abstract
Introduction: Ongoing monitoring of cohort demographic variation is an essential part of quality assurance in medical education assessments, yet the methods employed to explore possible underlying causes of demographic variation in performance are limited. Focussing on properties of the vignette text in single-best-answer multiple-choice questions (MCQs), we explore here the viability of conducting analyses of text properties and their relationship to candidate performance. We suggest that such analyses could become routine parts of assessment evaluation and provide an additional, equality-based measure of an assessment’s quality and fairness.

Methods: We describe how a corpus of vignettes can be compiled, followed by examples of using Microsoft Word’s native readability statistics calculator and the koRpus text analysis package for the R statistical analysis environment for estimating the following properties of the question text: Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (Grade), word count, sentence count, and average words per sentence (WpS). We then provide examples of how these properties can be combined with equality and diversity variables, and the process automated to provide ongoing monitoring.

Conclusions: Given the monitoring of demographic differences in assessment for assurance of equality, the ability to easily include textual analysis of question vignettes provides a useful tool for exploring possible causes of demographic variations in performance where they occur. It also provides another means of evaluating assessment quality and fairness with respect to demographic characteristics.
Microsoft Word provides data comparable to the specialized koRpus package, suggesting that routine use of word processing software for writing items and assessing their properties is viable with minimal burden. Automation for ongoing monitoring provides an additional means of standardizing MCQ assessment items and of eliminating or controlling textual variables as a possible contributor to differential attainment between subgroups.
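As a rough illustration of the text properties named in the Methods, the sketch below computes word count, sentence count, WpS, FRE, and Grade for a single vignette using the standard Flesch and Flesch-Kincaid formulas. It is written in Python rather than with Microsoft Word or koRpus, and it is not the procedure used in the paper. The function names, the regular-expression tokenizer, and the vowel-group syllable counter are all assumptions made for this example. Word and koRpus use their own tokenization and syllable rules, so their values will differ somewhat from these.

```python
import re


def count_syllables(word: str) -> int:
    """Crude syllable estimate (an assumption of this sketch, not a standard):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_properties(text: str) -> dict:
    """Return word count, sentence count, WpS, FRE, and Grade for one vignette."""
    # Naive sentence split on ., ! and ?; decimals such as "3.5" will be
    # split incorrectly, which matters for clinical vignettes.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Alphabetic tokens only; bare numbers are ignored by this tokenizer.
    words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")

    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)   # average words per sentence
    spw = syllables / len(words)        # average syllables per word

    return {
        "words": len(words),
        "sentences": len(sentences),
        "wps": wps,
        # Flesch Reading Ease
        "fre": 206.835 - 1.015 * wps - 84.6 * spw,
        # Flesch-Kincaid Grade Level
        "grade": 0.39 * wps + 11.8 * spw - 15.59,
    }
```

Applying `text_properties` to each vignette in a corpus gives one row of properties per item, which could then be stored alongside item identifiers for later analysis.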
Highlights
Ongoing monitoring of cohort demographic variation is an essential part of quality assurance in medical education assessments, yet the methods employed to explore possible underlying causes of demographic variation in performance are limited
We provide an example of estimating the Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (Grade), word count, sentence count, and average words per sentence (WpS), and describe how these properties can be combined with equality and diversity variables
Given the monitoring of demographic differences in assessment for assurance of quality and fairness, the ability to include analysis of question text complexity may be useful in exploring possible causes of differences if and when they occur
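One possible way to combine a text property with an equality and diversity variable is sketched below. It is only an illustration under stated assumptions, not the analysis reported in the paper. For each item it computes the difference in proportion correct between two subgroups, then takes the Pearson correlation between that difference and an item-level text property such as word count. The function names, the `(item_id, group, correct)` record layout, and the choice of a simple correlation are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def proportion_correct_gap(responses, group_a, group_b):
    """responses: iterable of (item_id, group, correct), with correct as 0 or 1.
    Returns {item_id: p_correct(group_a) - p_correct(group_b)} for items
    answered by at least one candidate in each group."""
    totals = defaultdict(lambda: [0, 0])  # (item, group) -> [n_correct, n_answered]
    for item_id, group, correct in responses:
        totals[(item_id, group)][0] += correct
        totals[(item_id, group)][1] += 1

    gaps = {}
    for item_id in {item for item, _ in totals}:
        a = totals.get((item_id, group_a))
        b = totals.get((item_id, group_b))
        if a and b:
            gaps[item_id] = a[0] / a[1] - b[0] / b[1]
    return gaps


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def property_gap_correlation(item_property, responses, group_a, group_b):
    """Correlate an item-level text property (e.g. word count) with the
    between-group gap in proportion correct, over items present in both."""
    gaps = proportion_correct_gap(responses, group_a, group_b)
    items = sorted(set(item_property) & set(gaps))
    return pearson([item_property[i] for i in items], [gaps[i] for i in items])
```

A non-zero correlation from a sketch like this would only flag items for closer review. It does not establish that text complexity causes a difference in attainment, and a real analysis would need to account for sample size and confounding.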
Summary
Ongoing monitoring of cohort demographic variation is an essential part of quality assurance in medical education assessments, yet the methods employed to explore possible underlying causes of demographic variation in performance are limited. Equality monitoring centres mainly, but not exclusively, on gender, ethnicity, and reported disability, and on the evaluation of differential attainment between subgroups within these characteristics in a given assessment. Not only is it important to ensure that assessments do not discriminate on the basis of these characteristics, it is also important from an educational point of view to ensure that differences in performance between candidates are unbiased by these demographic variables. If they are biased, the assessment does not reflect the candidates’ true ability, but rather their ability plus or minus the effect of the interaction between the assessment tool and demography. This position is recognized and supported by professional bodies such as the British Medical Association and the American Medical Association [5, 6], who require ongoing monitoring of cohort and demographic variation.