Abstract

Large-scale survey assessments have been used for decades to monitor what students know and can do. Such assessments aim to provide group-level scores for various populations, with little or no consequence to individual students for their test performance. Students' test-taking behaviors in survey assessments, particularly their level of test-taking effort, and the effects of those behaviors on performance have been a long-standing question. This paper presents a procedure for examining test-taking behaviors using response time collected from a National Assessment of Educational Progress (NAEP) computer-based study, the Mathematics Computer-Based Study (MCBS). A five-step procedure was proposed to identify rapid-guessing behavior in a more systematic manner. It involves a non-model-based approach that classifies student-item pairs as reflecting either solution behavior or rapid-guessing behavior. Three validity checks were incorporated in the validation step to ensure that the time boundaries were reasonable before further investigation. Results of the behavior classification were summarized by three measures to investigate whether and how students' test-taking behaviors related to student characteristics, item characteristics, or both. In the MCBS, the validity checks offered compelling evidence that the recommended threshold-identification method was effective in separating rapid-guessing behavior from solution behavior. A very low percentage of rapid-guessing behavior was identified compared with existing results for other assessments. For this dataset, rapid-guessing behavior had minimal impact on parameter estimation in the IRT modeling. However, students clearly exhibited different behaviors when they received items that did not match their performance level. We also found disagreement between students' response-time effort and their self-reports, but based on the observed data, it is unclear whether the disagreement was related to how students interpreted the background questions. The paper provides a way to address the issue of identifying rapid-guessing behavior and sheds light on the extent of students' engagement in NAEP and its impact, without relying on students' self-evaluations or adding costs to test design. It reveals useful information about test-taking behaviors in a NAEP assessment setting that has not been available in the literature. The procedure is applicable to future standard NAEP assessments, as well as other tests, when timing data are available.
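To make the classification step concrete, the sketch below flags student-item pairs as rapid-guessing versus solution behavior once per-item time boundaries are available, then summarizes each student's response-time effort. The data frame, column names, and threshold values are illustrative assumptions for this sketch, not the paper's actual data or exact implementation.

```python
import pandas as pd

# Illustrative long-format response data: one row per student-item pair.
# Column names and values are assumptions for this sketch.
responses = pd.DataFrame({
    "student_id": [1, 1, 2, 2],
    "item_id":    ["M1", "M2", "M1", "M2"],
    "rt_seconds": [45.0, 2.1, 60.3, 38.7],
})

# Hypothetical per-item time boundaries (e.g., from inspecting RT
# distributions or a normative rule); pairs at or below the boundary are
# classified as rapid-guessing behavior, pairs above as solution behavior.
thresholds = {"M1": 5.0, "M2": 4.0}

responses["rapid_guess"] = (
    responses["rt_seconds"] <= responses["item_id"].map(thresholds)
)

# Response-time effort: proportion of a student's item encounters
# classified as solution behavior.
rte = 1.0 - responses.groupby("student_id")["rapid_guess"].mean()
print(rte)
```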

Highlights

  • Large-scale survey assessments have been used for decades to monitor what students know and can do

  • The Mathematics Computer-Based Study (MCBS) provided detailed timing data, which permit the investigation of student behaviors in terms of response time (RT) as students move through the test. It helped answer a practical, long-standing question: can we evaluate National Assessment of Educational Progress (NAEP) students' extent of engagement when timing data are available? Because the MCBS had time limits at both stages of the test, test speededness cannot be completely ruled out, even though disengagement with the test is the primary concern in low-stakes assessments

  • The recommended five-step procedure in this paper aims to (a) strengthen existing approaches for the situation in which rapid-guessing behavior is of concern and needs to be assessed but not all items have clearly bimodal RT distributions, and (b) identify rapid-guessing behavior in a more systematic manner (a generic threshold-identification fallback for this situation is sketched after this list)
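The paper's specific five-step procedure is not reproduced on this page; the sketch below instead illustrates a generic normative-threshold fallback from the rapid-guessing literature (a fixed fraction of an item's mean RT, capped at a few seconds), which addresses items without clearly bimodal RT distributions. The fraction, cap, and example data are chosen purely for illustration.

```python
import numpy as np

def normative_threshold(rts, fraction=0.10, cap=10.0):
    """Fallback time boundary for an item whose RT distribution is not
    clearly bimodal: a fixed fraction of the item's mean RT, capped at
    `cap` seconds. The fraction and cap are illustrative choices, not
    the values used in the paper."""
    rts = np.asarray(rts, dtype=float)
    return min(fraction * rts.mean(), cap)

# Example: response times (in seconds) for one item across students.
item_rts = [3.2, 41.0, 55.5, 4.1, 62.8, 38.9, 47.3]
print(normative_threshold(item_rts))  # boundary below which RTs count as rapid guesses
```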


Summary

Introduction

Large-scale national and international survey assessments, such as the National Assessment of Educational Progress (NAEP), the Programme for International Student Assessment (PISA), and the Trends in International Mathematics and Science Study (TIMSS), have been used for decades to monitor what students know and can do. These survey assessments are often referred to as low-stakes assessments because they aim to provide group-level scores for various populations, and students taking them receive no academic credit and face little or no consequence for their test performance. To improve the quality of parameter estimation in measurement models and the validity of group score estimates, one solution is to identify responses from individual students that show disengagement with the test or items and remove them from the analysis
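As a minimal sketch of the removal step described above, the snippet below recodes responses flagged as rapid guesses to missing before any measurement-model calibration. The matrices and flag names are hypothetical, and this is only one possible treatment, not necessarily the one used in NAEP operational analyses.

```python
import numpy as np

# Hypothetical scored-response matrix (students x items) and a parallel
# boolean matrix of rapid-guessing flags from the classification step.
scores = np.array([[1, 0, 1],
                   [0, 0, 1]], dtype=float)
rapid_guess_flags = np.array([[False, True, False],
                              [False, False, False]])

# Recode flagged responses as missing so they do not distort
# item-parameter or group-score estimation downstream.
scores_cleaned = scores.copy()
scores_cleaned[rapid_guess_flags] = np.nan
print(scores_cleaned)
```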

