Abstract

This paper describes how item response theory (IRT) methods of test equating could be applied to the maintenance of public examination standards in England. IRT equating methods have been applied only sparingly to the main public examinations in England: the General Certificate of Secondary Education (GCSE), the equivalent of a school-leaving examination, taken at age 16, and A-levels, taken at age 18 prior to university entrance. This lack of application may be because such methods are thought irrelevant or surplus to requirements, or because the IRT models originally considered in this context were simple and rested on rigid assumptions that would not hold for the GCSE and A-level. This paper shows that the current methods used to maintain standards lack any reliable measure of performance standard, and explores developments in modern IRT that seek to overcome the restrictions of early IRT models and equating designs. Specifically, it reports on a post-equating study that attempts to link performance standards between a January and a June GCSE test session. The linking is carried out in a Bayesian IRT framework as well as a marginal maximum likelihood framework using the one-parameter logistic model. Results from various equating methods and IRT models are discussed and compared with the results of the standard methods used in England. The paper concludes that test-equating studies can yield fairly accurate estimates of performance standards that are only crudely available from the indicators currently used in the system.
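For readers unfamiliar with the model named in the abstract, the following is a minimal sketch of the one-parameter logistic (Rasch) item response function. The function name and the choice of scale origin are illustrative assumptions, not taken from the paper; the point is only that ability and item difficulty sit on a common logit scale, which is what makes linking performance standards across test sessions possible.

```python
import math

def p_correct(theta, b):
    """One-parameter logistic (Rasch) model: probability that a
    candidate with ability theta answers an item of difficulty b
    correctly. Both parameters lie on the same logit scale; placing
    items from two sessions on that common scale is the basis of
    IRT test equating. (Function name is illustrative.)"""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A candidate whose ability equals the item's difficulty has a 50%
# chance of success, wherever the origin of the common scale is set.
print(p_correct(0.0, 0.0))  # 0.5
```

In an equating study, difficulty parameters for items shared between the January and June sessions would be estimated jointly (e.g. by marginal maximum likelihood or Bayesian estimation), anchoring the two sessions' ability scales to each other.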
