Objective: The Affective Reactivity Index (ARI) is widely used to assess young people's irritability symptoms, but youth and caregivers often diverge in their assessments. Such informant discrepancy might stem from poor psychometric properties, from differing conceptualizations of irritability across informants, or from sociodemographic and clinical characteristics. We use an out-of-sample replication approach and leverage longitudinal data, available for a subset of the participants, to test these hypotheses.

Method: Across two independent samples (Cohort-1: N = 765, 8–21 years; Cohort-2: N = 1910, 6–21 years), we investigate the reliability and measurement invariance of the ARI, examine sociodemographic and clinical predictors of discrepant reporting, and probe the utility of a bifactor model for cross-informant integration.

Results: Despite good internal consistency and 6-week retest reliability of the parent (Cohort-1: α = 0.92, ICC = 0.85; Cohort-2: α = 0.93) and youth forms (Cohort-1: α = 0.88, ICC = 0.78; Cohort-2: α = 0.82), we confirm substantial informant discrepancy in ARI ratings (3 points on a scale from 0 to 12), which is stable over six weeks (ICC = 0.53). Measurement invariance across informants was weak, indicating that parents and youth may interpret ARI items differently. Irritability severity and diagnostic status predicted informant discrepancy, albeit in opposing directions: higher severity was linked to relatively higher irritability ratings by youth (Cohort-1: β = −0.06, p < .001; Cohort-2: β = −0.06, p < .001), while diagnoses of Disruptive Mood Dysregulation Disorder (Cohort-1: β = 0.44, p < .001; Cohort-2: β = 0.84, p < .001) and Oppositional Defiant Disorder (Cohort-1: β = 0.41, p < .001; Cohort-2: β = 0.42, p < .001) predicted relatively higher irritability ratings by caregivers. In both datasets, a bifactor model parsing informant-specific from shared irritability-related variance fit the data well (Cohort-1: CFI = 0.99, RMSEA = 0.05; Cohort-2: CFI = 0.99, RMSEA = 0.04).

Conclusion: Parent and youth ARI reports, and the discrepancy between them, are reliable and reflect different interpretations of the scale items; hence the two reports should not be averaged. This finding also suggests that irritability is not a unitary construct. Future work should investigate and model how different aspects of irritability might differ in their impact on the responses of specific informants.