Abstract

Open-ended and multiple choice questions are commonly placed on the same tests; however, there is ongoing debate about the effects of using different item types on test and item statistics. This study aims to compare model and item fit statistics in a mixed format test in which multiple choice and constructed response items are used together. In this 25-item fourth-grade science test administered to 2351 students in 35 schools in Turkey, items are calibrated both separately and concurrently using different IRT models. An important aspect of this study is that the effect of the calibration method on model and item fit is investigated with real data. First, the 1-, 2-, and 3-Parameter Logistic models are used to calibrate the dichotomously coded items, while the Graded Response Model and the Generalized Partial Credit Model are used to calibrate the open-ended ones. Then, combinations of dichotomous and polytomous models are employed concurrently. Model comparisons revealed that the combination of the 3PL model and the Graded Response Model produced the best fit statistics.
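For readers less familiar with the models being compared, the following minimal Python sketch (illustrative only, not the calibration code used in the study; the parameter names a, b, c and the thresholds are generic placeholders) shows the item response functions underlying the 3PL model for dichotomous items and the Graded Response Model for polytomous items.

    import numpy as np

    def p_3pl(theta, a, b, c):
        # 3PL: probability of a correct response, c + (1 - c) / (1 + exp(-a * (theta - b)))
        return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

    def p_grm(theta, a, thresholds):
        # GRM: category probabilities for one polytomous item.
        # thresholds are ordered category boundaries b_1 < ... < b_{K-1};
        # cumulative curves P(X >= k) are differenced to obtain P(X = k).
        cum = np.array([1.0] + [1.0 / (1.0 + np.exp(-a * (theta - b))) for b in thresholds] + [0.0])
        return cum[:-1] - cum[1:]

    # Example: one 3PL item and one three-category GRM item at ability theta = 0.5
    print(p_3pl(theta=0.5, a=1.2, b=0.0, c=0.2))
    print(p_grm(theta=0.5, a=1.0, thresholds=[-0.5, 0.8]))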

Highlights

  • Tests play crucial roles in individuals’ lives

  • The current study focuses on item calibrations based on the 1-, 2-, and 3-Parameter Logistic Models (1PL, 2PL, 3PL), the Generalized Partial Credit Model (GPCM), and the Graded Response Model (GRM)

  • Orlando and Thissen’s (2000, 2003) S-X² statistic was computed to evaluate item misfit throughout the study. The statistic was originally developed for dichotomous Item Response Theory (IRT) models and has been found to perform better than traditional item-fit statistics (a minimal sketch of its observed-versus-expected aggregation follows these highlights)
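As a rough illustration of how this fit index works (a sketch under simplifying assumptions, not the implementation used in the study), S-X² compares observed and model-expected proportions of correct responses within summed-score groups. The sketch below assumes the expected proportions have already been obtained elsewhere (typically via the Lord-Wingersky recursion) and only performs the Pearson-type aggregation.

    import numpy as np

    def s_x2(n_k, o_k, e_k):
        # Pearson-type aggregation behind Orlando and Thissen's S-X2 for a dichotomous item.
        # n_k: number of examinees in each summed-score group
        # o_k: observed proportion correct in each group
        # e_k: model-expected proportion correct in each group (computed separately)
        n_k, o_k, e_k = (np.asarray(x, dtype=float) for x in (n_k, o_k, e_k))
        return float(np.sum(n_k * (o_k - e_k) ** 2 / (e_k * (1.0 - e_k))))

    # Example with three summed-score groups (hypothetical numbers)
    print(s_x2(n_k=[120, 200, 150], o_k=[0.35, 0.55, 0.80], e_k=[0.30, 0.58, 0.78]))

The resulting value is referred to a chi-square distribution with degrees of freedom equal (roughly) to the number of score groups minus the number of estimated item parameters.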


Introduction

Tests play crucial roles in individuals’ lives. Exams are used for many reasons, such as the selection and placement of individuals, determining which knowledge areas need to be improved, and planning and revising educational programs. As Lissitz, Hou and Slater (2014) stress, if multiple choice (MC) items are used exclusively in testing, instruction and learning tend to neglect learners’ analysis, synthesis and evaluation skills, which in turn risks the loss of the active construction of knowledge. To address these limitations, constructed response (CR) items can be incorporated into tests. Mixed format tests that include both MC and CR items are effective measurement tools for teaching and learning because they overcome the limitations stemming from using either item type alone. When the two are combined, more reliable content total scores are obtained and a more precise latent trait is defined (Sykes & Yen, 2000). Still, as Hollingworth, Beard and Proctor (2007) state, some educators and policy makers believe that constructed response items and multiple choice items do not measure the same construct when placed on the same tests.
