Abstract

Objective: This study introduces a new method for establishing clinical thresholds for multi-item tests, based on item response theory (IRT), as an alternative to receiver operating characteristic (ROC) analysis. The performance of the IRT method was examined and compared with that of the ROC method across multiple simulated data sets and in a real data set.

Study Design and Setting: Simulated data sets (sample size: 1,000) varied in the means and variability of the test scores and in the prevalence of disease. The true clinical threshold was defined as a predetermined location on the latent trait underlying the questionnaire, with its corresponding expected test score. The real data set (sample size: 295) comprised Hospital Anxiety and Depression Scale (HADS) depression scores and Diagnostic and Statistical Manual of Mental Disorders-Fourth Edition major depressive disorder (MDD) diagnoses.

Results: The IRT method recovered the clinical thresholds without bias, whereas the ROC method identified thresholds that were biased by the prevalence of disease. Mild MDD was clinically diagnosed in 23%, moderate MDD in 12%, and severe MDD in 14% of the participants. The IRT method identified the following HADS depression score thresholds for mild, moderate, and severe MDD: 10.7, 13.2, and 15.1, respectively.

Conclusion: The new IRT method identifies clinical thresholds that are unbiased by disease prevalence.

Highlights

  • Psychological constructs, such as depression, are frequently measured using multi-item tests

  • To find the thresholds for Composite International Diagnostic Interview (CIDI)-based mild, moderate, and severe major depressive disorder (MDD), the sample was dichotomized in three different ways: (1) any MDD (‘mild+ MDD’, i.e., mild, moderate, or severe MDD) versus no MDD, (2) moderate-and-severe MDD (‘moderate+ MDD’) versus the rest, and (3) severe MDD versus the rest

  • We have shown that the optimal receiver operating characteristic (ROC) cutoff point coincides with the true clinical threshold only if the prevalence of the condition is 0.5
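
The prevalence dependence of the ROC cutoff can be illustrated with a toy simulation. The sketch below is not the authors' simulation design: it assumes a standard-normal latent trait, an observed score equal to the trait plus normal noise, disease status defined directly on the latent trait, and the Youden index (sensitivity + specificity - 1) as the ROC optimality criterion. All names and parameter values (`simulate`, `youden_cutoff`, `noise_sd`, `tau`) are hypothetical.

```python
import bisect
import random


def youden_cutoff(pos_scores, neg_scores, grid):
    """Return the grid cutoff that maximizes sensitivity + specificity - 1."""
    pos = sorted(pos_scores)
    neg = sorted(neg_scores)
    best_cut, best_j = grid[0], -1.0
    for c in grid:
        # A case is test-positive when its score is at or above the cutoff.
        sens = 1.0 - bisect.bisect_left(pos, c) / len(pos)
        spec = bisect.bisect_left(neg, c) / len(neg)
        j = sens + spec - 1.0
        if j > best_j:
            best_cut, best_j = c, j
    return best_cut


def simulate(tau, n=200_000, noise_sd=0.5, seed=0):
    """Simulate scores and return (prevalence, Youden-optimal cutoff).

    Assumption: the expected score at latent level theta is theta itself,
    so the true threshold on the score scale equals tau.
    """
    rng = random.Random(seed)
    pos, neg = [], []
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)               # latent trait
        score = theta + rng.gauss(0.0, noise_sd)  # noisy observed score
        (pos if theta > tau else neg).append(score)
    grid = [i / 100.0 for i in range(-300, 301)]
    return len(pos) / n, youden_cutoff(pos, neg, grid)


prev_half, cut_half = simulate(tau=0.0)  # threshold at the mean: prevalence near 0.5
prev_low, cut_low = simulate(tau=1.0)    # higher threshold: lower prevalence
```

Under these assumptions, `cut_half` lands close to its threshold of 0, whereas `cut_low` falls noticeably below its threshold of 1, that is, it is pulled toward the population mean.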

Summary

Methods

IRT models the scoring of questionnaire items as the interaction between a latent trait and certain characteristics of the items, notably location and discrimination (see Supplementary file 1, section 1) [10]. A construct such as depression can be thought of as a continuous trait [11,12]. The symptoms are thought to be caused by depression, and the items serve as indicators of the latent depression trait. The location of a dichotomous item (with response options ‘0’ and ‘1’) is the position on the latent trait at which the probability of endorsing option ‘1’ is 50%. The IRT model links the latent depression trait to the manifest item scores, the sum of which is the test score (i.e., the depression score). The latent trait represents the true, but unobservable, level of depression, and the outcome of the scoring process, the depression score, represents an estimate of that level of depression.
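
The relation between item location, discrimination, and the expected test score can be sketched in a few lines. The example assumes a two-parameter logistic form for dichotomous items; the exact model used in the study is specified in the supplementary file and may differ. The item parameters and the latent threshold below are made up for illustration.

```python
import math


def p_endorse(theta, location, discrimination=1.0):
    """Probability of endorsing option '1' on a dichotomous item.

    Assumes a two-parameter logistic item response function.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - location)))


def expected_test_score(theta, items):
    """Expected sum score at latent trait level theta."""
    return sum(p_endorse(theta, b, a) for b, a in items)


# Hypothetical (location, discrimination) pairs, for illustration only.
items = [(-1.0, 1.2), (0.0, 1.0), (0.5, 1.5), (1.0, 0.8)]

# At an item's own location the endorsement probability is 50%.
at_location = p_endorse(0.5, location=0.5, discrimination=1.5)

# A hypothetical threshold on the latent trait and the expected
# test score that corresponds to it.
latent_threshold = 0.5
threshold_score = expected_test_score(latent_threshold, items)
```

Because each item response function increases with the latent trait, the expected test score does too, so every location on the latent trait maps to exactly one expected test score.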
