Abstract

Nedelsky (1954) and Angoff (1971) have suggested procedures for establishing a cutting score based on raters' judgments about the likely performance of minimally competent examinees on each item in a test. In this paper generalizability theory is used to characterize and quantify expected variance in cutting scores resulting from each procedure. Experimental test data are used to illustrate this approach and to compare the two procedures. Consideration is also given to the impact of rater disagreement on some issues of measurement reliability or dependability. Results suggest that the differences between the Nedelsky and Angoff procedures may be of greater consequence than their apparent similarities. In particular, the restricted nature of the Nedelsky (inferred) probability scale may constitute a basis for seriously questioning the applicability of this procedure in certain contexts.

