Overview of Classical Test Theory and Item Response Theory for the Quantitative Assessment of Items in Developing Patient-Reported Outcomes Measures

Joseph C Cappelleri, J Jason Lundy, Ron D Hays

doi:10.1016/j.clinthera.2014.04.006


Open Access


https://doi.org/10.1016/j.clinthera.2014.04.006


Abstract


Background: The US Food and Drug Administration’s guidance for industry document on patient-reported outcomes (PRO) defines content validity as “the extent to which the instrument measures the concept of interest” (FDA, 2009, p. 12). According to Strauss and Smith (2009), construct validity “is now generally viewed as a unifying form of validity for psychological measurements, subsuming both content and criterion validity” (p. 7). Hence, both qualitative and quantitative information are essential in evaluating the validity of measures.

Methods: We review classical test theory and item response theory (IRT) approaches to evaluating PRO measures, including the frequency of responses to each category of the items in a multi-item scale, the distribution of scale scores, floor and ceiling effects, the relationship between item response options and the total score, and the extent to which the hypothesized “difficulty” (severity) order of items is represented by observed responses.

Results: If a researcher has few qualitative data and wants preliminary information about the content validity of the instrument, then descriptive assessments using classical test theory should be the first step. As the sample size grows during subsequent stages of instrument development, confidence in the numerical estimates from Rasch and other IRT models (as well as those of classical test theory) would also grow.

Conclusion: Classical test theory and IRT can be useful in providing a quantitative assessment of items and scales during the content-validity phase of PRO-measure development. Depending on the particular type of measure and the specific circumstances, classical test theory, IRT, or both should be considered to help maximize the content validity of PRO measures.
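The descriptive classical test theory checks listed in the abstract (response-category frequencies, floor and ceiling effects, the item-total relationship, and the observed severity ordering of items) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 0 to 4 response scale, and the six-respondent data set are all hypothetical, chosen only to make the calculations concrete.

```python
from collections import Counter
from math import sqrt

# Hypothetical data: 6 respondents x 3 items, each item scored 0-4.
# These values are invented for illustration and come from no real study.
data = [
    [0, 0, 0],
    [1, 0, 0],
    [2, 1, 0],
    [3, 2, 1],
    [4, 3, 2],
    [4, 4, 4],
]


def category_frequencies(responses, categories):
    """Proportion of respondents endorsing each response category of one item."""
    counts = Counter(responses)
    n = len(responses)
    return {c: counts.get(c, 0) / n for c in categories}


def floor_ceiling(scale_scores, min_score, max_score):
    """Proportion of respondents at the lowest (floor) and highest (ceiling) scale score."""
    n = len(scale_scores)
    floor = sum(s == min_score for s in scale_scores) / n
    ceiling = sum(s == max_score for s in scale_scores) / n
    return floor, ceiling


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(rows, item):
    """Correlate one item with the sum of the remaining items (item removed from the total)."""
    item_scores = [row[item] for row in rows]
    rest_scores = [sum(row) - row[item] for row in rows]
    return pearson(item_scores, rest_scores)


def item_means(rows):
    """Mean score per item; a lower mean suggests a more 'difficult' (severe) item."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]


totals = [sum(row) for row in data]
freqs_item0 = category_frequencies([row[0] for row in data], range(5))
floor, ceiling = floor_ceiling(totals, min_score=0, max_score=12)
r_item0 = corrected_item_total(data, 0)
means = item_means(data)
```

Comparing `means` against the hypothesized severity order gives a first, purely descriptive check of item ordering. With samples this small the estimates are unstable, which is consistent with the abstract's point that confidence in the numerical estimates grows with sample size.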

Full Text