Abstract

Background
Despite therapeutic advancements, the cellular and molecular heterogeneity of ulcerative colitis (UC) hinders effective treatment. We applied machine learning-based approaches to whole scanned images (WSI) of H&E-stained slides prepared from formalin-fixed, paraffin-embedded intestinal mucosal biopsies from the etrolizumab clinical trials, aiming to identify cellular interactions and molecular signatures that define disease subtypes and treatment responses. We hypothesised that the cellular basis of non-response could be derived by linking cell-cell relationships with paired transcriptomic profiles.

Methods
We analysed 2297 WSI paired with molecular data from 1272 UC patients from the etrolizumab phase III clinical trials, spanning multiple cohorts, treatment arms, and timepoints. Sections were segmented using a convolutional neural network, which extracted 8 cell types and their coordinates. For each WSI, ~500 interaction features were engineered from an overlaid cell-cell network using our in-house software, CellMaskProfiler. The topology of the cell-cell interaction networks allowed WSI with similar profiles to be clustered. Clusters were annotated using cell count features and clinical data. Paired molecular data were used for differential expression analysis.

Results
Using the cell-cell topology, we identified 7 distinct UC subtypes characterised by specific cellular interactions, molecular profiles, and clinical features. These resolved into intact and neutrophil-, eosinophil-, plasma cell-, or lymphocyte-dominated profiles. WSI with a high proportion of epithelial cells and more connections between epithelial cell types were strongly associated with a lower Robarts Histopathology Index, indicating milder disease or intact epithelium, whereas strong interaction with neutrophils showed the opposite effect. Comparing treatment arms allowed us to relate baseline biopsy WSI characteristics to the probability of treatment response, showing how that probability depends on the initial WSI subtype. The cellular topology also allowed us to characterise transitions between diseased and healthier clusters. After adjusting for cellularity effects, comparing baseline data by response at induction revealed genes that characterise response. Analysis of transitions between clusters revealed similar genes.

Conclusion
This study introduces a novel feature engineering approach to unravel the cellular and molecular complexity of WSI in UC. The findings underscore the significance of cellular interactions in UC pathology and provide new insights into the molecular mechanisms underlying different disease states. They also demonstrate that the structural topology (form) derived from WSI can be effectively linked to transcriptomic profiles (function) to define the probability of remission.
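The network feature engineering step described in the Methods can be illustrated with a minimal sketch. This is not CellMaskProfiler, whose internals the abstract does not describe. It assumes, purely for illustration, that two cells are connected when their centroids lie within a fixed distance, and it derives two feature families: cell type proportions and the share of network edges joining each pair of cell types. The function name `interaction_features`, the `radius` parameter, and the toy cell types are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import dist


def interaction_features(cells, radius):
    """Toy interaction features for one image.

    cells  : list of (x, y, cell_type) tuples, e.g. from a segmentation model
    radius : assumed distance threshold for linking two cells (illustrative)
    """
    # Count edges per unordered cell-type pair. The all-pairs loop is O(n^2);
    # a real slide would need a spatial index instead.
    edge_counts = Counter()
    for (x1, y1, t1), (x2, y2, t2) in combinations(cells, 2):
        if dist((x1, y1), (x2, y2)) <= radius:
            edge_counts[tuple(sorted((t1, t2)))] += 1

    n_edges = sum(edge_counts.values())
    type_counts = Counter(cell_type for _, _, cell_type in cells)

    features = {}
    # Cell count features: proportion of each cell type in the image.
    for cell_type, n in type_counts.items():
        features[f"prop:{cell_type}"] = n / len(cells)
    # Interaction features: fraction of all edges joining each type pair.
    for (a, b), n in edge_counts.items():
        features[f"edge:{a}|{b}"] = n / n_edges
    return features


# Hypothetical toy input: three nearby cells and one isolated lymphocyte.
toy_cells = [
    (0.0, 0.0, "epithelial"),
    (1.0, 0.0, "epithelial"),
    (0.0, 1.0, "neutrophil"),
    (10.0, 10.0, "lymphocyte"),
]
features = interaction_features(toy_cells, radius=1.5)
```

Feature vectors of this kind, computed per image, could then be passed to any standard clustering method. The abstract does not state which method or which edge rule the authors used.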