Abstract

In operational testing, item response theory (IRT) models for dichotomous responses are popular for measuring a single latent construct θ, such as cognitive ability in a content domain. Estimates of θ, also called IRT scores or θ̂, can be computed using estimators based on the likelihood function, such as maximum likelihood (ML), weighted likelihood (WL), maximum a posteriori (MAP), and expected a posteriori (EAP). Although the parameter space of θ is theoretically unrestricted, the range of finite θ̂ is constrained by the estimator and the properties of the test form, which is important to consider but often overlooked when developing a score scale for reporting purposes. Irrespective of the estimator or test forms at hand, a common practice is to fix arbitrary points symmetric about zero (e.g., −4 and 4) as anchors for deriving a score transformation, possibly resulting in unintended gaps or truncations at the extremes. Therefore, a systematic framework is proposed for using IRT scores to construct a robust score scale with informed boundaries that are logical and consistent across test forms.
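The two ingredients the abstract describes can be illustrated with a minimal sketch: an EAP estimate of θ under a two-parameter logistic (2PL) model with a standard-normal prior, followed by the common linear score transformation that anchors fixed points such as θ = −4 and θ = 4 to the ends of a reporting scale. This is not the framework proposed in the paper; the function names, item parameters, and the 100–200 reporting scale are illustrative assumptions.

```python
import numpy as np

def eap_estimate(responses, a, b, grid=np.linspace(-4, 4, 81)):
    """EAP estimate of theta for dichotomous responses under a 2PL model,
    using a standard-normal prior and rectangular quadrature on `grid`."""
    # 2PL probability of a correct response at each quadrature point
    p = 1.0 / (1.0 + np.exp(-a * (grid[:, None] - b)))
    # Likelihood of the observed 0/1 response pattern at each point
    like = np.prod(np.where(responses, p, 1.0 - p), axis=1)
    prior = np.exp(-0.5 * grid**2)            # standard-normal kernel
    post = like * prior
    post /= post.sum()                        # normalize the posterior
    return float(np.dot(grid, post))          # posterior mean = EAP

def to_reported_scale(theta, lo=-4.0, hi=4.0, scale_min=100.0, scale_max=200.0):
    """Linear transformation anchoring theta = lo and theta = hi to the
    endpoints of a (hypothetical) 100-200 reporting scale."""
    slope = (scale_max - scale_min) / (hi - lo)
    return scale_min + slope * (theta - lo)

# Example: three items and an all-correct response pattern
a = np.array([1.0, 1.2, 0.8])                 # discriminations (assumed)
b = np.array([-0.5, 0.0, 0.5])                # difficulties (assumed)
theta_hat = eap_estimate(np.array([1, 1, 1], dtype=bool), a, b)
reported = to_reported_scale(theta_hat)
```

Note that the EAP estimate is shrunk toward the prior mean, so finite θ̂ values never reach the grid boundaries; fixing −4 and 4 as anchors regardless of the estimator is exactly the practice the abstract flags as a source of gaps or truncation at the extremes.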