Insights into the Angoff method: results from a simulation study.

Boaz Shulruf, Jennifer Weller, Phillippa Poole, Philip Jones, Tim Wilkinson

doi:10.1186/s12909-016-0656-7

Open Access

https://doi.org/10.1186/s12909-016-0656-7

Abstract

Background: In standard setting techniques involving panels of judges, the attributes of judges may affect the cut-scores. This simulation study modelled the effect of the number of judges and test items, as well as the impact of judges’ attributes such as accuracy, stringency and influence on others, on the precision of the cut-scores.

Methods: Forty-nine combinations of Angoff panels (N = 5, 10, 15, 20, 30, 50, and 80) and test items (n = 5, 10, 15, 20, 30, 50, and 80) were simulated. Each combination was simulated 100 times (4,900 simulations in total). Three judge attributes were simulated: stringency, accuracy and leadership. The impact of judges’ attributes, the number of judges, the number of test items and Angoff’s second round (compared with the first) on the precision of a panel’s cut-score was measured by the deviation of the panel’s cut-score from its true value.

Results: Findings from 4,900 simulated panels supported Angoff being both reliable and valid. Unless the number of test items is small, panels of around 15 judges with mixed levels of expertise provide the most precise estimates. Furthermore, if test data are not presented, a second round of decision-making, as used in the modified Angoff, adds little to precision. A panel that has only experts or only non-experts yields a less precise cut-score than a mixed-expertise panel, suggesting that the optimal composition of an Angoff panel should include a range of judges with diverse expertise and stringency.

Conclusions: Simulations aim to improve our understanding of the models assessed, but they do not describe natural phenomena because they do not use observed data. While the simulations undertaken in this study help clarify how to set cut-scores defensibly, it is essential to confirm these theories in practice.

Highlights

In standard setting techniques involving panels of judges, the attributes of judges may affect the cut-scores
Standard setting is an important aspect of assessment, with the literature describing a plethora of methods
The results were derived from 4,900 simulated Angoff panels comprising 100 simulations of each of the 49 combinations of the number of judges and of items

Summary

Introduction

In standard setting techniques involving panels of judges, the attributes of judges may affect the cut-scores. In the Angoff method, each judge estimates the proportion of minimally competent examinees who would give a correct answer to each of the items. Those estimates are then aggregated across items and judges to produce the panel’s cut-score. Research on the utility of Angoff suggests that the cut-scores generated by a panel are affected by the panel’s composition, the number of judges and their levels of expertise [7, 17, 32, 66, 70].
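The aggregation step can be illustrated with a minimal sketch. It assumes the common convention of averaging each judge's item estimates and then averaging across judges; the function name `angoff_cut_score` and the ratings are hypothetical, chosen for illustration only, and are not taken from the study.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return a panel cut-score as a proportion between 0 and 1.

    ratings[j][i] is judge j's estimate of the proportion of minimally
    competent examinees who would answer item i correctly.
    Assumption: the panel cut-score is the mean of the judges' mean ratings.
    """
    if not ratings or not ratings[0]:
        raise ValueError("ratings must contain at least one judge and one item")
    if any(len(row) != len(ratings[0]) for row in ratings):
        raise ValueError("every judge must rate the same number of items")
    # One cut-score per judge: the mean of that judge's item estimates.
    judge_scores = [mean(row) for row in ratings]
    # Panel cut-score: the mean across judges.
    return mean(judge_scores)


# Hypothetical panel of three judges rating four items.
ratings = [
    [0.60, 0.70, 0.50, 0.80],  # judge of middling stringency
    [0.50, 0.60, 0.40, 0.70],  # more stringent judge
    [0.70, 0.80, 0.60, 0.90],  # more lenient judge
]
cut = angoff_cut_score(ratings)
print(f"Panel cut-score: {cut:.2f}")
```

Because every judge rates the same items, averaging by judge first gives the same result as averaging all estimates at once; keeping the per-judge scores simply makes differences in stringency between judges easy to inspect.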

Methods

Results

Discussion

Conclusion