This study investigates the validity of assessing L2 pragmatics in interaction using mixed methods, focusing on the evaluation inference. Open role-plays that are meaningful and relevant to the stakeholders in an English for Academic Purposes context were developed for classroom assessment. To support meaningful score interpretations and accurate evaluations of interaction-involved pragmatic performances, interaction-sensitive, data-driven rating criteria were developed based on qualitative analyses of examinees’ role-play performances. Conversation analysis of the data revealed various pragmatic and interactional features indicative of differing levels of pragmatic competence in interaction. The FACETS analysis indicated that the role-plays reliably differentiated among the 102 examinees’ varying levels of pragmatic ability. The raters showed internal consistency despite their differing degrees of severity. Stable fit statistics and distinct difficulties were reported within each of the interaction-sensitive rating criteria. The findings served as backing for the evaluation inference in the validity argument. Finally, implications of the findings for operationalizing interaction-involved language performances and developing rating criteria are discussed.