Abstract

A basic consideration in large‐scale assessments that use constructed response (CR) items, such as essays, is how to allocate the essays to the raters who score them. Designs used in practice are incomplete, in that each essay is scored by only a subset of the raters, and also unbalanced, in that the number of essays scored differs across raters. In addition, not all possible rater pairs may be used. The present study examines the effects of these factors on parameter recovery and classification accuracy using simulations of a latent class model based on signal detection theory (SDT). Many tests also include more than one CR item, which introduces a nested or hierarchical structure into the design, in that raters are nested within essays (i.e., there are multiple raters per essay) and essays are nested within examinees (i.e., each examinee provides two or more essays). A hierarchical rater model (HRM) has previously been developed to recognize the nested structure. A version of the HRM that incorporates a latent class signal detection model in the first level, referred to as the HRM‐SDT model, is presented. Parameter recovery in the HRM‐SDT model is examined in simulations. The model is applied to data from several ETS tests.
