Abstract

Background: This paper is part of a series comparing different psychometric approaches to evaluate patient-reported outcome (PRO) measures using the same items and dataset. We provide an overview and example application to demonstrate 1) using item response theory (IRT) to identify poor and well performing items; 2) testing if items perform differently based on demographic characteristics (differential item functioning, DIF); and 3) balancing IRT and content validity considerations to select items for short forms.

Methods: Model fit, local dependence, and DIF were examined for 51 items initially considered for the Patient-Reported Outcomes Measurement Information System® (PROMIS®) Depression item bank. Samejima’s graded response model was used to examine how well each item measured severity levels of depression and how well it distinguished between individuals with high and low levels of depression. Two short forms were constructed based on psychometric properties and consensus discussions with instrument developers, including psychometricians and content experts. Calibrations presented here are for didactic purposes and are not intended to replace official PROMIS parameters or to be used for research.

Results: Of the 51 depression items, 14 exhibited local dependence, 3 exhibited DIF for gender, and 9 exhibited misfit, and these items were removed from consideration for short forms. Short form 1 prioritized content, and thus items were chosen to meet DSM-V criteria rather than being discarded for lower discrimination parameters. Short form 2 prioritized well performing items, and thus fewer DSM-V criteria were satisfied. Short forms 1–2 performed similarly for model fit statistics, but short form 2 provided greater item precision.

Conclusions: IRT is a family of flexible models providing item- and scale-level information, making it a powerful tool for scale construction and refinement.
Strengths of IRT models include placing respondents and items on the same metric, testing DIF across demographic or clinical subgroups, and facilitating creation of targeted short forms. Limitations include the need for large sample sizes to obtain stable item parameters and for familiarity with measurement methods to interpret results. Combining psychometric data with stakeholder input (including people with lived experiences of the health condition and clinicians) is highly recommended for scale development and evaluation.
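
To make the graded response model named in the Methods more concrete, the following is a minimal Python sketch of how category probabilities and item information follow from one discrimination parameter and a set of ordered thresholds. It is an illustration under stated assumptions, not the IRTPRO analysis used in the paper: the function names and the example item parameters are hypothetical and are not PROMIS calibrations.

```python
import math


def cumulative_probs(theta, a, thresholds):
    """P(X >= k | theta) for every category boundary of one item.

    a          : discrimination (slope) parameter
    thresholds : ordered severity thresholds b_1 < b_2 < ... < b_{m-1}
    The list is padded with 1.0 (lowest category or above) and
    0.0 (above the highest category).
    """
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    return [1.0] + inner + [0.0]


def category_probs(theta, a, thresholds):
    """P(X = k | theta): difference of adjacent cumulative curves."""
    cum = cumulative_probs(theta, a, thresholds)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def item_information(theta, a, thresholds):
    """Item information of a graded response item at theta."""
    cum = cumulative_probs(theta, a, thresholds)
    # Derivative of each cumulative curve with respect to theta;
    # it is zero for the padded boundary values 1.0 and 0.0.
    slopes = [a * p * (1.0 - p) for p in cum]
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]
        if p_k > 0.0:
            info += (slopes[k] - slopes[k + 1]) ** 2 / p_k
    return info


if __name__ == "__main__":
    # Hypothetical five-category item (illustrative values only).
    a, b = 2.0, [0.0, 1.0, 2.0, 3.0]
    for theta in (-1.0, 0.5, 2.0):
        probs = [round(p, 3) for p in category_probs(theta, a, b)]
        print(theta, probs, round(item_information(theta, a, b), 3))
```

In this sketch the thresholds determine where on the depression continuum an item is informative, and the discrimination parameter scales how much information it contributes there. That is the sense in which a short form built from more discriminating items can offer greater precision.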

Highlights

  • This paper is part of a series comparing different psychometric approaches to evaluate patient-reported outcome (PRO) measures using the same items and dataset

  • We applied item response theory (IRT) modeling to the measurement of depression as part of a series of papers comparing different psychometric methodologies for evaluating patient-reported outcome (PRO) measures (IRT, classical test theory, and Rasch modeling)

  • IRTPRO software [22] was used to examine model fit, local independence, and differential item functioning (DIF) of 51 depression items developed by the Patient-Reported Outcomes Measurement Information System (PROMIS) initiative [23]


Summary

Introduction

Systematically administered PROs improve detection of symptoms and enhance clinician-patient communication and patients’ satisfaction with care [7,8,9,10]. Given this variety of uses, robust PRO measure development and evaluation is critical. There is a large literature on applying statistical methods such as item response theory (IRT) to develop and evaluate PRO measures (e.g., [15,16,17,18,19,20]). This paper is part of a series comparing psychometric approaches (IRT, classical test theory, and Rasch analysis) using the same items and dataset [21].

