Abstract

Deep learning has become a popular tool for analyzing hematoxylin and eosin (H&E) stained whole slide images (WSIs) and has been used to study conserved spatial behaviors across cancers [1]. Deep learning models, however, are black boxes and difficult to interpret. We propose the concept of an abstract morphological gene, hereafter called a mone, defined as the features encoding the morphology at each region of a WSI. Mones averaged over WSIs share distributional similarities with bulk-level gene expression, which allows the use of tools originally developed for studying gene expression. We study more than 22,000 H&E slides from 19 cancers in The Cancer Genome Atlas (TCGA) using the InceptionV3 network. t-SNE plots suggest that mones detect tumors and distinguish tissues of origin. Mone-based classifiers achieved AUCs above 95% for one-versus-all prediction of tumor status and tissue of origin. We also obtained cross-classification accuracies comparable to [1] using a mone-based logistic regression model. Differential mone analysis identifies pan-cancer and cancer-specific mones that differentiate tumor slides from normal slides. Mone 893 is a pan-cancer feature and an indicator of dense cellular regions; we identified it in breast cancer WSIs and validated its consistent behavior in ovarian and lung cancers. Differential mone analysis comparing formalin-fixed paraffin-embedded (FFPE) and fresh frozen slides identifies deep learning features that may be affected by frozen tissue artifacts. Removing such features is essential for developing models that are not confounded by tissue artifacts. While recent deep learning models predict gene expression from WSIs (see [2] for examples), here we use gene expression to identify the morphological features that mones encode. Integrative mone-gene co-expression analysis suggests that, in breast cancer, mone 893 correlates strongly with genes in the integrin signaling pathway and in the inflammation mediated by cytokines and chemokines pathway. Mone 869 correlates strongly with expression of COL8A1 in ovarian cancer. Expression of collagen genes is associated with poor prognosis and drug resistance in ovarian cancer [3,4]. These findings indicate significant associations between individual deep learning-defined features and both genetic and prognostic measures.
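
As an illustration of the two operations the abstract relies on, averaging region-level features into slide-level mones and then comparing tumor against normal slides, the following is a minimal sketch and not the authors' pipeline. The function names, the toy three-feature vectors, and the use of a plain difference in group means as the ranking statistic are all assumptions made for the example; the abstract does not state which statistical test the differential mone analysis uses.

```python
from statistics import fmean


def column_means(rows):
    """Return the mean of each column of a non-empty list of equal-length rows."""
    if not rows:
        raise ValueError("expected at least one row")
    return [fmean(column) for column in zip(*rows)]


def slide_mones(tile_features):
    """Average per-tile feature vectors into one slide-level mone vector.

    tile_features: list of per-tile feature vectors (hypothetical input;
    in practice these would come from a network such as InceptionV3).
    """
    return column_means(tile_features)


def rank_differential_mones(tumor_slides, normal_slides):
    """Rank mone indices by absolute difference in group means.

    Assumption: a difference in means stands in for whatever test the
    paper actually uses; it only illustrates the idea of ranking mones.
    """
    tumor_mean = column_means(tumor_slides)
    normal_mean = column_means(normal_slides)
    diffs = [t - n for t, n in zip(tumor_mean, normal_mean)]
    return sorted(range(len(diffs)), key=lambda i: abs(diffs[i]), reverse=True)


# Toy example: two tiles with three features each.
tiles = [[0.0, 1.0, 4.0], [2.0, 1.0, 0.0]]
print(slide_mones(tiles))

# Toy slide-level mone vectors for two tumor and two normal slides.
tumor = [[1.0, 1.0, 5.0], [1.0, 3.0, 5.0]]
normal = [[1.0, 1.0, 1.0], [1.0, 2.0, 1.0]]
print(rank_differential_mones(tumor, normal))
```

In this toy data the third feature differs most between the groups, so it is ranked first. A real analysis would replace the difference in means with a proper statistical test and a multiple-testing correction.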
