Classification accuracy in Key Stage 2 National Curriculum tests in England

Qingping He, Malcolm Hayes, Dylan Wiliam

doi:10.1080/02671522.2012.754225

Abstract

The accuracy of the results of the national tests in English, mathematics and science taken by 11-year-olds in England has been a matter of much debate since their introduction in 1994, with estimates of the proportion of students incorrectly classified varying from 10 to 30%. Using live data from the 2009 and 2010 administrations of the national tests, this paper applies a number of models, drawing on both classical and modern test theories, to explore the relationship between test reliability and the extent of misclassification when a student’s test score is reported as one of a small number of discrete levels of achievement. The results indicate that, across the two cohorts (2009 and 2010) and six models, average classification accuracy was about 85%, 90% and 87% in English, mathematics and science, respectively. Moreover, the different models yielded very similar results; the standard deviations of the classification accuracy values they generated were 1.9% for English, 1.0% for mathematics and 1.3% for science.
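
The link between reliability and misclassification described in the abstract can be illustrated with a small simulation. The sketch below is not one of the six models used in the paper; it is a minimal classical-test-theory illustration under assumptions of our own: true scores are standard normal, measurement error is normal and independent of the true score, and the cut scores (given in standard-deviation units) are hypothetical. The function name `classification_accuracy` and its parameters are likewise illustrative.

```python
import bisect
import random


def classification_accuracy(reliability, cut_scores, n_students=20000, seed=0):
    """Estimate the share of simulated students whose observed level
    matches the level implied by their true score."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    rng = random.Random(seed)
    # Classical test theory: reliability = var(true) / (var(true) + var(error)).
    # With var(true) fixed at 1 (an assumption), the error SD follows directly.
    error_sd = ((1.0 - reliability) / reliability) ** 0.5
    cuts = sorted(cut_scores)
    agree = 0
    for _ in range(n_students):
        true_score = rng.gauss(0.0, 1.0)
        observed = true_score + rng.gauss(0.0, error_sd)
        # A level is the number of cut scores at or below the score.
        true_level = bisect.bisect_right(cuts, true_score)
        observed_level = bisect.bisect_right(cuts, observed)
        if true_level == observed_level:
            agree += 1
    return agree / n_students


if __name__ == "__main__":
    hypothetical_cuts = [-1.0, 0.0, 1.0]  # illustrative level boundaries
    for rel in (0.80, 0.90, 0.95):
        acc = classification_accuracy(rel, hypothetical_cuts)
        print(f"reliability={rel:.2f}  accuracy={acc:.3f}")
```

Under these assumptions, raising the reliability shrinks the error variance, so fewer students near a cut score are pushed into the wrong level. The absolute figures depend on the assumed score distribution and the placement of the cuts, so they should not be compared directly with the percentages reported in the abstract.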

Full Text