Introduction
Alongside the usual exam-level cut-score requirement, the use of a conjunctive minimum number of stations passed (MNSP) standard in OSCE-type assessments is common practice in some parts of the world. Typically, the MNSP is fixed in advance with little justification and does not vary from one administration to another in a particular setting, which is not congruent with best assessment practice for high-stakes examinations. In this paper, we investigate empirically four methods of setting such a standard in an examinee-centred (i.e. post hoc) and criterion-based way that allows the standard to vary appropriately with station and test difficulty.

Methods and results
Using many administrations (n = 442) from a single exam (PLAB2 in the UK), we show via mixed modelling that the total number of stations passed for each candidate has reliability close to that of the total test score (relative g-coefficients 0.73 and 0.76, respectively). We then argue that calculating the MNSP based on the predicted number of stations passed at the 'main' exam-level cut-score (i.e. for the borderline candidate) is conceptually, theoretically and practically the preferred of the four approaches considered. Further analysis indicates that this standard does vary from administration to administration but acts in a secondary way, with approximately a quarter of exam-level candidate failures resulting from application of the MNSP standard alone.

Conclusion
Collectively, this work suggests that employing the identified approach to setting the MNSP standard is practically possible and, in many settings, is more defensible than using a fixed number of stations set in advance.
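To make the preferred approach concrete, the sketch below shows one way such an MNSP could be derived. The abstract does not specify the modelling details, so the regression form, variable names and simulated data are illustrative assumptions rather than the authors' method: stations passed is regressed on total score, and the fitted value at the exam-level cut-score (i.e. the prediction for the borderline candidate) is rounded down to give the MNSP for that administration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# --- Simulated administration (all names and values are illustrative) ---
n_candidates, n_stations = 300, 16
ability = rng.normal(0.0, 1.0, n_candidates)      # latent candidate ability
difficulty = rng.normal(0.0, 0.5, n_stations)     # station difficulty varies by form

# Each station pass depends on ability relative to station difficulty
p_pass = 1.0 / (1.0 + np.exp(-(ability[:, None] - difficulty[None, :])))
stations_passed = (rng.random((n_candidates, n_stations)) < p_pass).sum(axis=1)

# Total exam score as a noisy proxy for ability, with an exam-level cut-score
total_score = ability * 10 + 50 + rng.normal(0.0, 2.0, n_candidates)
cut_score = 50.0   # the 'main' exam-level cut-score for this administration

# Regress number of stations passed on total score; the fitted value at the
# cut-score estimates how many stations a borderline candidate would pass.
X = sm.add_constant(total_score)
fit = sm.OLS(stations_passed, X).fit()
predicted = fit.predict(np.array([[1.0, cut_score]]))[0]

# Round down so the conjunctive standard does not exceed the prediction
mnsp = int(np.floor(predicted))
print(f"Predicted stations passed at cut-score: {predicted:.2f} -> MNSP = {mnsp}")
```

Because the regression is refitted for each administration, the resulting MNSP moves with station and test difficulty, which is the property the paper argues a defensible conjunctive standard should have.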