Abstract
To monitor students’ progress and adapt instruction to students’ needs, teachers increasingly use repeated assessments with equivalent tests. The present study investigates whether equivalent reading tests can be successfully developed via rule-based item design. Based on theoretical considerations, we identified three item features each for reading comprehension at the word, sentence, and text levels, which should influence the difficulty and time intensity of reading processes. Using optimal design algorithms, a design matrix was calculated, and four equivalent test forms of the German reading test series for second graders (quop-L2) were developed. A total of N = 7,751 students completed the tests. We estimated item difficulty and time intensity parameters as well as person ability and speed parameters using bivariate item response theory (IRT) models, and we investigated the influence of item features on item parameters. Results indicate that all item properties significantly affected either item difficulty or response time. Moreover, as indicated by the IRT-based test information functions and analyses of variance, the four test forms showed similar levels of difficulty and time intensity at the word, sentence, and text levels (all η² < .002). Results were successfully cross-validated using a sample of N = 5,654 students.
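The abstract does not spell out the model equations. One widely used joint formulation of this kind (van der Linden's hierarchical speed–accuracy framework) pairs a Rasch-type accuracy model with a lognormal response-time model; the following is an illustrative sketch with assumed notation, not necessarily the paper's exact specification:

```latex
% Illustrative joint speed-accuracy model (assumed, not taken from the paper):
% accuracy of person p on item i, with ability \theta_p and difficulty b_i
P(U_{pi} = 1 \mid \theta_p) = \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)}
% log response time, with speed \tau_p, time intensity \beta_i, dispersion \sigma_i
\ln T_{pi} \sim \mathcal{N}\left(\beta_i - \tau_p,\; \sigma_i^2\right)
```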
Highlights
Reading skills are one of the most fascinating and powerful achievements of human development, and they are critical for successful participation in today’s society (Lenhard, 2019).
Previous research has shown that the rank order of passages varies depending on the readability index used (Good & Kaminski, 2002), that readability formulas show low correlations, if any, with students’ words-correct-per-minute (WCPM) scores (Ardoin et al., 2005), and that they are inferior to procedures in which passage selection is based on field testing and evaluation of student performance (Christ & Ardoin, 2009).
We evaluated the equivalence of the test forms by comparing test information functions (TIF), by correlating item difficulties and time intensities between pairs of test forms, and by analyses of variance (ANOVA) comparing mean item difficulties and time intensities across test forms, as sketched below.
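A minimal Python sketch of these three equivalence checks for two hypothetical Rasch-calibrated forms; the item difficulties below are invented for illustration, and numpy/scipy are assumed to be available:

```python
import numpy as np
from scipy.stats import f_oneway, pearsonr

def rasch_tif(theta, difficulties):
    """Test information function under a Rasch model:
    I(theta) = sum_i P_i(theta) * (1 - P_i(theta))."""
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulties[None, :])))
    return (p * (1.0 - p)).sum(axis=1)

# Hypothetical calibrated difficulties for two of the four test forms.
form_a = np.array([-1.2, -0.4, 0.1, 0.6, 1.3])
form_b = np.array([-1.1, -0.5, 0.2, 0.5, 1.4])

theta_grid = np.linspace(-3, 3, 61)
tif_a = rasch_tif(theta_grid, form_a)
tif_b = rasch_tif(theta_grid, form_b)

# Check 1: similar TIF curves across the ability range.
print("max TIF gap:", np.abs(tif_a - tif_b).max())

# Check 2: item difficulties correlate across forms
# (assumes items are matched pairwise by the design rules).
r, _ = pearsonr(form_a, form_b)
print("difficulty correlation:", r)

# Check 3: one-way ANOVA on mean difficulty, with eta squared as effect size.
F, p_val = f_oneway(form_a, form_b)
groups = [form_a, form_b]
grand = np.concatenate(groups).mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_total = ((np.concatenate(groups) - grand) ** 2).sum()
print("eta^2:", ss_between / ss_total)  # the paper reports all eta^2 < .002
```

The same checks would be run on the time intensity parameters; only the input vectors change.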
Summary
Reading skills are one of the most fascinating and powerful achievements of human development, and they are critical for successful participation in today’s society (Lenhard, 2019). Using rule-based item design has several advantages (see Holling et al., 2009), one of the most important of which is that once the essential cognitive components that influence item difficulties are known, items can be generated and applied on this empirical basis without needing to calibrate every single item. This makes it possible to create an arbitrary number of equivalent test forms, which is beneficial when constructing tests for progress monitoring assessments (see the sketch below).
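This approach resembles the linear logistic test model (LLTM) idea of decomposing item difficulty into feature effects. A minimal sketch, assuming a simple least-squares decomposition; the feature codes and difficulties below are invented for illustration:

```python
import numpy as np

# Hypothetical design matrix Q: rows = calibrated items, columns = three
# rule-based item features (e.g. word frequency band, sentence length band,
# inference demand); all values here are invented.
Q = np.array([
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
], dtype=float)
b = np.array([-1.4, -0.7, -0.5, 0.2, 0.8, 1.6])  # calibrated difficulties

# LLTM-style decomposition: b_i ~ intercept + sum_k Q_ik * eta_k.
X = np.column_stack([np.ones(len(Q)), Q])
eta, *_ = np.linalg.lstsq(X, b, rcond=None)
print("feature weights:", eta)

# Predict the difficulty of a new, never-calibrated item from its rules alone.
new_item = np.array([1.0, 1.0, 0.0, 1.0])  # intercept term + feature codes
print("predicted difficulty:", new_item @ eta)
```

In practice the feature weights would be estimated jointly within the IRT model rather than by ordinary least squares; the sketch only illustrates why calibrating every single generated item becomes unnecessary.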