Abstract

Researchers have tested L2 pragmatics using at least six types of instruments. This study focuses on four of them: (a) Written Discourse Completion Tasks, (b) Oral Discourse Completion Tasks, (c) Discourse Role-Play Tasks (DRPT), and (d) Role-Play Self-Assessments. Students learning Korean (n = 53) took these tests. Generalizability theory was used to examine the effects of different numbers of raters, item functions, item types, and item characteristics on the dependability (analogous to reliability) of these tests. In addition, multifaceted Rasch (FACETS) analyses were used to investigate: (a) the relative severity or difficulty (on a common logit scale) of individual raters, item functions, item types, and item characteristics and (b) the degree to which the five-point scale was working well on each test. Based on these analyses, recommendations are made for designing each of the four types of tests, specifying the numbers of different raters, item functions, item types, and item characteristics that should be used to maximize test dependability in light of whatever practical considerations language teachers, testers, administrators, or researchers may deem important when using L2 pragmatics tests for actual decision making. The relative importance of item types, functions, and testing methods is also considered from a theoretical perspective.
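The dependability analysis summarized above can be illustrated with a minimal sketch of how a generalizability-theory dependability (phi) coefficient is estimated for a persons-by-raters crossed design. This is not the study's actual analysis or data; the ratings and the `phi_coefficient` helper below are hypothetical, and real G studies typically include additional facets (item functions, item types) and use dedicated software.

```python
# Illustrative sketch (hypothetical data, not from the study): estimating
# variance components and a G-theory dependability (phi) coefficient for
# a fully crossed persons-by-raters design.

def phi_coefficient(scores, n_raters_decision):
    """scores: one list of ratings per person, one rating per rater.
    n_raters_decision: number of raters assumed in the decision study."""
    n_p = len(scores)
    n_r = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    person_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(scores[p][r] for p in range(n_p)) / n_p
                   for r in range(n_r)]

    # ANOVA sums of squares and mean squares for the p x r design
    ss_p = n_r * sum((m - grand) ** 2 for m in person_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in rater_means)
    ss_tot = sum((scores[p][r] - grand) ** 2
                 for p in range(n_p) for r in range(n_r))
    ss_pr = ss_tot - ss_p - ss_r
    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

    # Expected-mean-square estimates of the variance components
    var_pr = ms_pr                          # interaction + residual error
    var_p = max((ms_p - ms_pr) / n_r, 0.0)  # persons (object of measurement)
    var_r = max((ms_r - ms_pr) / n_p, 0.0)  # rater main effect

    # Phi counts rater effects and interaction as absolute error,
    # averaged over the number of raters used for the decision.
    return var_p / (var_p + (var_r + var_pr) / n_raters_decision)

# Hypothetical ratings: 4 examinees scored by 2 raters on a 5-point scale
ratings = [[3, 4], [2, 2], [5, 4], [1, 2]]
phi_2 = phi_coefficient(ratings, n_raters_decision=2)
phi_4 = phi_coefficient(ratings, n_raters_decision=4)
```

Comparing `phi_2` with `phi_4` shows the kind of trade-off the abstract describes: adding raters shrinks the error term and raises dependability, which can then be weighed against the practical cost of recruiting more raters.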
