Abstract

Background: In testing contexts that are predominately concerned with power, rapid guessing (RG) has the potential to undermine the validity of inferences made from educational assessments, as such responses are unreflective of the knowledge, skills, and abilities assessed. Given this concern, practitioners and researchers have utilized a multitude of response time threshold procedures that classify RG responses in these contexts based on the use of no empirical data (e.g., an arbitrary time limit), response time distributions, or the combination of response time and accuracy information. As there is little understanding of how these procedures compare to each other, this meta-analysis sought to investigate whether threshold typology is related to differences in descriptive, measurement property, and performance outcomes in these contexts.

Methods: Studies were sampled that: (a) employed two or more response time (RT) threshold procedures to identify and exclude RG responses on the same computer-administered low-stakes power test; and (b) evaluated differences between procedures on the proportion of RG responses and responders, measurement properties, and test performance.

Results: Based on as many as 86 effect sizes, our findings indicated non-negligible differences between RT threshold procedures in the proportion of RG responses and responders. The largest differences for these outcomes were observed between procedures using no empirical data and those relying on response time and accuracy information. However, these differences were not related to variability in aggregate-level measurement properties and test performance.

Conclusions: When filtering RG responses to improve inferences concerning item properties and group score outcomes, the actual threshold procedure chosen may be of less importance than the act of identifying such deleterious responses. However, given the conservative nature of RT thresholds that use no empirical data, practitioners may wish to avoid these procedures when making inferences at the individual level, given their potential for underclassifying RG.
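
To make the idea of an RT threshold procedure concrete, the following minimal sketch flags a response as RG when its response time falls below an item's threshold, then compares the proportion of flagged responses under two procedures. Everything specific here is an assumption for illustration and is not taken from the studies analyzed: the response times are made up, the 3-second cutoff stands in for a threshold chosen without empirical data, and the 10%-of-mean-item-time rule stands in for a threshold derived from response time distributions.

```python
from statistics import mean

# Hypothetical response times in seconds (rows: examinees, columns: items).
response_times = [
    [40.0, 70.0, 4.0],
    [2.0, 65.0, 60.0],
    [55.0, 5.0, 66.0],
    [63.0, 100.0, 70.0],
]


def fixed_thresholds(rts, seconds=3.0):
    """Same cutoff for every item, set without empirical data (3 s is assumed)."""
    return [seconds] * len(rts[0])


def normative_thresholds(rts, fraction=0.10):
    """Per-item cutoff equal to an assumed fraction of the item's mean RT."""
    n_items = len(rts[0])
    return [fraction * mean(row[j] for row in rts) for j in range(n_items)]


def flag_rapid_guesses(rts, thresholds):
    """Mark a response as RG when its time is below the item's threshold."""
    return [[rt < thresholds[j] for j, rt in enumerate(row)] for row in rts]


def proportion_flagged(flags):
    """Proportion of all responses classified as RG."""
    total = sum(len(row) for row in flags)
    return sum(sum(row) for row in flags) / total


fixed_prop = proportion_flagged(
    flag_rapid_guesses(response_times, fixed_thresholds(response_times))
)
normative_prop = proportion_flagged(
    flag_rapid_guesses(response_times, normative_thresholds(response_times))
)

print(f"Fixed threshold:     {fixed_prop:.3f}")
print(f"Normative threshold: {normative_prop:.3f}")
```

With these invented data the two procedures flag different proportions of responses, which is the kind of descriptive difference the meta-analysis compares across studies. The direction and size of the difference in this toy example carry no evidential weight.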

Highlights

  • In testing contexts that are predominately concerned with power, rapid guessing (RG) has the potential to undermine the validity of inferences made from educational assessments, as such responses are unreflective of the knowledge, skills, and abilities assessed

  • There are multiple forms of noneffortful responding; in the context of assessments concerned predominately with power, increased attention has been placed on rapid guessing (RG)

  • To fill this gap in the literature, this paper conducts a meta-analysis of studies that compare two or more response time (RT) threshold procedures on the same empirical dataset obtained from computer administration of a low-stakes power assessment

Summary

Introduction

In testing contexts that are predominately concerned with power, rapid guessing (RG) has the potential to undermine the validity of inferences made from educational assessments, as such responses are unreflective of the knowledge, skills, and abilities assessed. Assuming that examinees have been administered items with which they are capable of effortfully engaging (i.e., they have had an opportunity to learn the content assessed, and they are proficient in the test language), RG can occur due to two factors: (a) time limit constraints (i.e., test speededness); and (b) low test-taking effort (Wise, 2017). Concerning the former, examinees may not have the time to fully engage with all test items, and may employ RG in an effort to increase their score (assuming no penalty for incorrect responses). This form of RG has been documented in high-stakes tests, in which the personal consequences of examinee performance are significant (see Schnipke & Scrams, 1997). Disengaged RG has been documented across a number of low-stakes assessments (i.e., those in which examinee performance has minimal to no personal consequences) and a myriad of ages and cultures (see Rios, 2021a).

