Ensuring the validity of assessments requires a thorough examination of the test content. Subject matter experts (SMEs) are commonly employed to evaluate the relevance, representativeness, and appropriateness of the items. This article proposes using item response theory (IRT) to model the assessments made by SMEs. IRT allows discrimination and threshold parameters to be estimated for each SME, providing evidence of how well each expert differentiates relevant from irrelevant items. This facilitates the detection of suboptimal SME performance and improves item relevance scores. The IRT-based approach was compared with traditional content validity indices (the content validity index and Aiken's V) in the evaluation of items. The aim was to assess the SMEs' accuracy in identifying whether or not items were designed to measure conscientiousness, and in predicting the items' factor loadings. The IRT-based scores effectively identified conscientiousness items (R² = 0.57) and accurately predicted their factor loadings (R² = 0.45). These scores demonstrated incremental validity, explaining 11% more variance than Aiken's V and up to 17% more than the content validity index. Modeling SME assessments with IRT improves item alignment and provides better predictions of factor loadings, making it possible to improve the content validity of measurement instruments.