P357 Automated assessment of histological disease activity in Ulcerative Colitis using a deep learning algorithm

K Kohli, A Mookhoek, G Cathomas, A Khan, A Lugli, F Müller, I Zlobec

doi:10.1093/ecco-jcc/jjad212.0487

Abstract

Background: Assessment of histological disease activity in patients with ulcerative colitis (UC) is becoming more important. Therefore, the DCA score was developed for use in clinical practice. Distribution (D), chronicity (C) and activity (A) are each scored on a three-point scale, with distribution depending on the percentage of biopsy area affected by inflammatory changes. A major problem for the implementation of any score is interobserver heterogeneity. To overcome this hurdle, we developed a deep learning algorithm that computes the DCA score automatically.

Methods: A retrospective cohort was created consisting of 117 adult patients with ulcerative colitis from whom biopsies were taken during a colonoscopy performed in the setting of routine clinical evaluation (training set of 537 slides). From 75 cohort patients, an additional set of biopsies was available (test set of 299 slides). An expert gastrointestinal pathologist assigned the DCA scores. The model was trained on H&E-stained whole slide images scanned at 40x (0.243 μm per pixel) resolution. Training was performed in a weakly supervised multiple-instance learning setup, using slide-level labels without region-of-interest annotations. Features from the convolutional neural network-based model trained at 2.5x, 5x, and 10x magnification were combined using a random forest classifier for the final prediction. Model accuracy was evaluated on the test set.

Results: The algorithm differentiates between normal and inflamed with an accuracy of 90.9% (Table 1). Model performance with features aggregated across all magnification levels (i.e., 2.5x, 5x, and 10x) was superior to performance with features from individual levels. This can be explained with the help of Grad-CAM heatmaps (Figure 1), which show that features learned at different magnification levels correspond to different aspects of histological changes in UC. The multiple-instance approach did not appear to detect crypt abscesses, which may have resulted in relatively poor performance in differentiating A1 from A2. The model accuracy for the 9-class DCA score was 64.8%.

Conclusion: The algorithm appears to first assess architectural distortions at lower magnification and then the presence of cellular elements such as neutrophils at higher magnification. This closely mirrors the approach of pathologists. The present model automates the scoring process and, by supporting daily diagnostic practice, highlights the potential of deep learning technologies in the assessment of histological disease activity in UC.
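The multi-magnification aggregation described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how tile-level features are pooled into a slide-level vector, so mean pooling is an assumption here, and the names `pool_tiles`, `slide_vector` and `accuracy` are hypothetical. The concatenated vector is what a downstream classifier (a random forest in the abstract) would take as input; the classifier itself is left out.

```python
from statistics import mean

# Magnification levels named in the abstract; the fixed order keeps the
# concatenated feature layout identical for every slide.
MAGNIFICATIONS = ("2.5x", "5x", "10x")


def pool_tiles(tile_features):
    """Mean-pool a bag of tile feature vectors into one vector.

    Mean pooling is an assumption; the abstract does not specify the pooling.
    """
    if not tile_features:
        raise ValueError("a slide needs at least one tile per magnification")
    return [mean(column) for column in zip(*tile_features)]


def slide_vector(tiles_by_magnification):
    """Concatenate the pooled vectors of all magnifications for one slide."""
    vector = []
    for magnification in MAGNIFICATIONS:
        vector.extend(pool_tiles(tiles_by_magnification[magnification]))
    return vector


def accuracy(predicted, expected):
    """Fraction of slides whose predicted label equals the reference label."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


# Toy slide with two-dimensional tile features (illustrative values only).
toy_slide = {
    "2.5x": [[0.0, 1.0], [1.0, 0.0]],
    "5x": [[2.0, 2.0]],
    "10x": [[1.0, 3.0], [3.0, 1.0]],
}
features = slide_vector(toy_slide)
```

Because only slide-level labels are available in a weakly supervised setup, each slide is treated as a bag of tiles, and the label is attached to the pooled vector rather than to any individual tile.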
