Abstract

Objective: PROMIS offers computerized adaptive tests (CAT) of patient-reported outcomes, using a single set of US-based IRT item parameters across populations and language-versions. The use of country-specific item parameters has local appeal, but also disadvantages. We illustrate the effects of choosing US or country-specific item parameters on PROMIS CAT T-scores.

Study design and setting: Simulations were performed on response data from Dutch chronic pain patients (n = 1110) who completed the PROMIS Pain Behavior item bank. We compared CAT T-scores obtained with (1) US parameters; (2) Dutch item parameters; (3) US item parameters for DIF-free items and Dutch item parameters (rescaled to the US metric) for DIF items; (4) Dutch item parameters for all items (rescaled to the US metric).

Results: Without anchoring to a common metric, CAT T-scores cannot be compared. When scores were rescaled to the US metric, mean differences in CAT T-scores based on US vs. Dutch item parameters were negligible. However, 0.9%–4.3% of the T-score differences were larger than 5 points (0.5 SD).

Conclusion: The choice of item parameters can be consequential for individual patient scores. We recommend more studies of translated CATs to examine if strategies that allow for country-specific item parameters should be further investigated.

Highlights

  • Item response theory (IRT) is increasingly used to create item banks as the basis for computerized adaptive testing (CAT) for measuring patient-reported outcomes (PROs) [1,2,3,4,5]

  • When scores were rescaled to the US metric, mean differences in CAT T-scores based on US vs. Dutch item parameters were negligible

  • 0.9%–4.3% of the T-score differences were larger than 5 points (0.5 standard deviation (SD))


Summary

Introduction

Item response theory (IRT) is increasingly used to create item banks as the basis for computerized adaptive testing (CAT) for measuring patient-reported outcomes (PROs) [1,2,3,4,5]. The Patient-Reported Outcomes Measurement Information System (PROMIS) is the largest system of PRO item banks administered as CATs [9,10,11,12]. The default PROMIS convention is to use a single set of IRT item parameters across populations and language-versions to express scores on a common scale (the T-score metric), unless evidence shows that this is problematic, e.g., if items function substantially differently across populations or language-versions [9,13]. A method adopted from the equating and linking literature, the Stocking-Lord method, was used to place the Dutch item parameters on the US metric [25,26,27].
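
To make the idea of a common metric concrete, the sketch below shows what rescaling item parameters amounts to under a graded response model. It is a minimal illustration, not the authors' procedure. It assumes the linking constants A and B are already known (the Stocking-Lord method estimates them by matching test characteristic curves, which is not reproduced here), and it assumes the usual T-score convention of mean 50 and SD 10 in the reference population. All item parameter values, constants, and function names are hypothetical.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Graded response model: probability of each response category for one item."""
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (beyond highest).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def rescale_item(a, thresholds, A, B):
    """Move item parameters to the target metric, where theta_target = A * theta + B."""
    return a / A, [A * b + B for b in thresholds]


def to_t_score(theta):
    """Assumed T-score convention: mean 50, SD 10 in the reference population."""
    return 50.0 + 10.0 * theta


# Hypothetical Dutch-metric parameters for one five-category item.
a_nl, b_nl = 1.8, [-1.0, 0.2, 1.1, 2.0]
# Hypothetical linking constants (in practice estimated, e.g. with Stocking-Lord).
A, B = 0.9, 0.3

a_us, b_us = rescale_item(a_nl, b_nl, A, B)

theta_nl = 0.5                # a person's score on the Dutch metric
theta_us = A * theta_nl + B   # the same person on the US metric

probs_nl = grm_category_probs(theta_nl, a_nl, b_nl)
probs_us = grm_category_probs(theta_us, a_us, b_us)
t_us = to_t_score(theta_us)
```

The rescaling leaves the model's predicted response probabilities unchanged (`probs_nl` and `probs_us` agree up to rounding); only the scale on which scores are reported moves. This is why T-scores based on country-specific parameters become comparable with US-based T-scores only after linking.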

