Abstract

BACKGROUND Histology is emerging as a key therapeutic endpoint for ulcerative colitis, driven by associations between histologic response and long-term outcomes. However, existing scoring systems are subjective and consequently suffer from inter- and intra-reader variability. Geboes scoring is a well-established system for ulcerative colitis histologic assessment that has previously been used to define thresholds for histo-endoscopic mucosal improvement (Geboes Score ≤3.1, together with Mayo score 0 or 1) and histologic remission (Geboes Score <2). Here we report the first machine learning (ML)-based prediction of the Geboes Score, and of Geboes Score-derived thresholds of histologic improvement and remission, directly from whole slide images (WSI) of hematoxylin and eosin (H&E)-stained mucosal biopsies.
METHODS A total of 3,148 WSI were scored by three expert gastrointestinal pathologists, and the median consensus score was used as the ground-truth Geboes Score for each slide. ML models were trained on the median consensus scores to predict the Geboes Score and subscores for each slide. Model performance against the pathologist median consensus score was measured using accuracy and the F1 score, which accounts for both false positive and false negative errors.
RESULTS Measured against the median consensus scores of three pathologists, the ML-based model performed strongly at predicting the overall Geboes Score, with a quadratic kappa of 0.89. The model was also able to predict both histologic improvement and histologic remission with high accuracy. For predicting histologic improvement, defined as a Geboes Score of ≤3.1, the model showed an accuracy of 0.92 and an F1 score of 0.92 (Figure 1). For predicting histologic remission, defined as a Geboes Score of <2, the model showed an accuracy of 0.91 and an F1 score of 0.89 (Figure 2).
CONCLUSIONS We report an ML-based approach for predicting the Geboes Score and the key Geboes Score-based thresholds of histologic improvement and histologic remission. Model predictions show high accuracy compared with median consensus pathologist scores. This approach may enable standardized, reproducible, and accurate prediction of these clinically relevant thresholds to better measure histologic disease activity and treatment response in clinical trials.
