Abstract

Constructed-response (CR) questions are a mainstay of introductory physics textbooks and exams. However, because of the time, cost, and scoring-reliability constraints associated with this format, CR questions are increasingly being replaced by multiple-choice (MC) questions in formal exams. The integrated testlet (IT) is a recently developed question structure designed to provide a proxy for the pedagogical advantages of CR questions while procedurally functioning as a set of MC questions. ITs use an answer-until-correct response format that provides immediate confirmatory or corrective feedback, and they thus allow not only for the granting of partial credit in cases of initially incorrect reasoning but also for the building of cumulative question structures. Here, we report on a study that directly compares the functionality of ITs and CR questions in introductory physics exams. To do this, CR questions were converted to concept-equivalent ITs, and both sets of questions were deployed in midterm and final exams. We find that both question types provide adequate discrimination between stronger and weaker students, with CR questions discriminating slightly better than ITs. Meanwhile, an analysis of inter-rater scoring of the CR questions raises serious concerns about the reliability of the granting of partial credit when this traditional assessment technique is used in a realistic (but non-optimized) setting. Furthermore, we show evidence that partial credit is granted in a valid manner in the ITs. Thus, together with the vastly reduced cost of administering IT-based examinations compared to CR-based examinations, our findings indicate that ITs are viable replacements for CR questions in formal examinations where it is desirable both to assess concept integration and to reward partial knowledge while scoring examinations efficiently.

Highlights

  • Constructed-response (CR) questions are a mainstay of introductory physics textbooks and examinations

  • The difference in scores between MC and CR items may be attributed to several factors: the added opportunities for guessing available in MC testing; cuing effects resulting from the presence of the correct answer among the MC options; and, within our integrated testlet (IT), the fact that feedback provided to students using the Immediate Feedback Assessment Technique (IF-AT) may enhance performance on subsequent items

  • The recently developed integrated testlet—a group of interdependent MC items that share a stem and are administered with an answer-until-correct response protocol—has been described as a possible replacement for CR-format questions in large classroom assessments [9]


Introduction

Constructed-response (CR) questions are a mainstay of introductory physics textbooks and examinations. Often called “problems,” these questions require the student to generate an acceptable response by demonstrating their integration of a wide and often complex set of skills and concepts. An expert must interpret the response and gauge its level of “correctness.” In multiple-choice (MC) testing, response options are provided within the question, with the correct answer (the keyed option) listed along with several incorrect answers (the distractors); the student’s task is to select the correct answer. Because response interpretation is not required in scoring MC items, scoring is quicker, cheaper, and more reliable [1,2,3], and these factors contribute to the increasing use of MC questions in introductory physics exams [1,4,5].
