Abstract

Raters represent a significant source of unexplained, and often undesired, variance in performance-based assessments. To better understand rater variance, this study investigated how different raters, observing the same performance, perceived relationships among the noncognitive attributes measured in performance assessments. Medical admissions data from a Multiple Mini-Interview (MMI) used at one Canadian medical school were collected and analyzed using the Many-Facet Rasch Model (MFRM) and hierarchical clustering. This MMI consisted of eight stations. At each station, a faculty member and an upper-year medical student rated applicants on noncognitive attributes including communication, critical thinking, effectiveness, empathy, integrity, maturity, professionalism, and resolution. The Rasch analyses revealed differences between faculty and student raters across the eight MMI stations. These analyses also showed that, at times, raters were unable to distinguish between the noncognitive attributes. Hierarchical clustering highlighted differences in how faculty and student raters perceived the noncognitive attributes. Differences in how individual raters associated the attributes within a station were also observed. Together, the MFRM and hierarchical clustering helped to explain some of the variability associated with raters in a way that other measurement models are unable to capture. These findings suggest that differences in ratings may result from raters holding different interpretations of an observed performance. This study has implications for developing more purposeful rater selection and rater profiling in performance-based assessments.
