Abstract

Background
Progress in clinical trials and drug development for paediatric perianal fistulising Crohn’s disease (pfCD) depends upon accurate, reproducible imaging to assess disease activity and measure treatment response. We evaluated the inter- and intra-rater reliability of existing radiologic disease activity indices and anatomical classification systems previously identified as appropriate for assessment of paediatric pfCD [1].

Methods
A retrospective cohort (N=50) of magnetic resonance imaging (MRI) exams representing a full spectrum of paediatric (<18 years) perianal Crohn’s disease (CD) severity was selected by a single expert paediatric radiologist. Sample size was estimated using the methods described by Zou [2]. Three separate radiologists (2 paediatric), blinded to clinical information, assessed pfCD in the exams using the Magnetic Resonance Novel Index for Fistula Imaging in CD (MAGNIFI-CD), the Van Assche Index (VAI), the modified VAI (mVAI), the Paediatric MRI-based Perianal Crohn’s disease (PEMPAC) index, a visual analogue scale (VAS) of overall pfCD disease severity (range, 0-100 mm), and the Parks and St. James University Hospital anatomic classification systems. Two radiologists read all exams twice, with reads separated by 2 weeks, and 1 radiologist read the exams once (250 total reads). Intra- and inter-rater reliability was evaluated using intraclass correlation coefficients (ICC; equivalent to weighted kappa). Precision of the ICC estimates was quantified with 95% confidence intervals (CIs) obtained using clustered bootstrapping. Degree of reliability was interpreted with the benchmarks proposed by Landis and Koch [3].

Results
Median age of the children was 13.5 years (IQR 11.4-16.1), and 64% were male. Median Paediatric Crohn’s Disease Activity Index score at the time of imaging was 17.5 (IQR 10-32.5). Substantial (ICC 0.61-0.80) inter-rater reliability was observed for MAGNIFI-CD, VAI, and mVAI (Table 1), with the highest ICC observed for the mVAI (0.72 [95% CI 0.60 to 0.79]).
Moderate (ICC 0.41-0.60) inter-rater reliability was observed for PEMPAC, the St. James University Hospital classification, and the VAS, and fair (ICC 0.21-0.40) inter-rater reliability was observed for the Parks classification. Intra-rater reliability was almost perfect (ICC 0.81-1.00) for the VAS and substantial for all indices except the Parks classification, which had moderate intra-rater reliability (0.60 [95% CI 0.44 to 0.75]) (Table 1).

Conclusion
Most existing multi-item MRI-based disease activity indices and the St. James University Hospital classification were reliable for the assessment of paediatric pfCD in our cohort. Our preliminary results suggest that the disease activity indices may be useful for measuring treatment effect in paediatric pfCD clinical trials.
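The reliability analysis described in the Methods can be illustrated with a minimal Python sketch. The abstract does not state which ICC form or bootstrap settings were used, so everything below is an assumption made for illustration: the two-way random-effects, absolute-agreement, single-rater form ICC(2,1), a percentile bootstrap that resamples whole exams (the clusters) with replacement, and the hypothetical function names `icc2_1`, `cluster_bootstrap_ci`, and `landis_koch`. The simulated scores are synthetic and are not study data.

```python
import numpy as np


def icc2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: array of shape (n_exams, n_raters) with no missing values.
    ASSUMPTION: the abstract does not say which ICC form was used.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for exams (rows), raters (columns), and residual error.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def cluster_bootstrap_ci(scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the ICC, resampling whole exams with replacement.

    Each exam is one cluster: all of its reads stay together in a resample.
    ASSUMPTION: n_boot, alpha, and the percentile method are illustrative.
    """
    x = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw n exam indices with replacement
        stats[b] = icc2_1(x[idx])
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


def landis_koch(value):
    """Map a reliability coefficient to the Landis and Koch benchmark label."""
    if value < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if value <= upper:
            return label
    return "almost perfect"


# Synthetic example: 50 exams scored by 3 raters (NOT the study data).
rng = np.random.default_rng(42)
true_severity = rng.normal(10.0, 4.0, size=(50, 1))
simulated = true_severity + rng.normal(0.0, 2.5, size=(50, 3))
estimate = icc2_1(simulated)
ci_low, ci_high = cluster_bootstrap_ci(simulated, n_boot=500)
print(f"ICC = {estimate:.2f} [{ci_low:.2f} to {ci_high:.2f}], "
      f"{landis_koch(estimate)}")
```

Resampling at the exam level, rather than at the level of individual reads, keeps every rater's score for a given exam in the same bootstrap sample, so the interval reflects between-exam variability without treating correlated reads as independent.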