Abstract Haematoxylin and eosin (H&E) stained histopathological images are part of the routine clinical workup for many cancers, but their interpretation is laborious and has a subjective element. Recent advances in computer vision using deep neural networks offer an unprecedented opportunity to quantify the patterns on H&E images that are typically used to distinguish tumour from normal tissue. It is, moreover, possible to associate these traits with molecular alterations to obtain a precise understanding of the molecular underpinnings of histological changes. Here we demonstrate widespread associations of quantitative image traits with underlying genetic alterations, including TP53 mutations and whole-genome duplications (WGD), and with molecular expression profiles ranging from cell morphology to T-cell infiltration. These traits are on par with molecular data in predicting patient survival. We used Inception-V4, a deep neural network, to extract 1,536 image features from 9,754 tumour or normal H&E images across 28 cancers from TCGA (split into 7.9 million tiles of 512 by 512 pixels). Using these features, tissue-specific tumour/normal classification accuracy was high, averaging 0.95 and ranging from 0.99 for Thyroid Carcinoma to 0.64 for Head and Neck Squamous Cell Carcinoma. We then used these features to predict molecular alterations using regression models. Out of 104 driver gene mutations, 30 showed an association with histology in at least one cancer (AUC > 0.5). Interestingly, TP53, the most frequently mutated gene in cancer, was associated with measurably distinct histology in 10 of 28 cancers. WGD was associated with histology in 24 out of 27 cancers (AUC > 0.5), with 4 cancers showing an AUC as high as 0.8. As expected, the presence of WGD also correlated with increased cell nucleus volume. For gene expression, 7.7% of genes showed a positive correlation with image features (FDR < 0.1), and 2.7% showed a correlation above 0.6.
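The preprocessing step described above, splitting each whole-slide image into non-overlapping 512 by 512 pixel tiles before feature extraction, can be sketched as follows. This is a minimal illustration, not the study's actual pipeline; the function name `tile_image` and the handling of partial edge tiles (discarded here) are assumptions.

```python
import numpy as np

def tile_image(img: np.ndarray, tile_size: int = 512):
    """Split a slide array of shape (H, W, 3) into non-overlapping
    tile_size x tile_size tiles, discarding partial tiles at the edges."""
    h, w = img.shape[:2]
    tiles = []
    for y in range(0, h - tile_size + 1, tile_size):
        for x in range(0, w - tile_size + 1, tile_size):
            tiles.append(img[y:y + tile_size, x:x + tile_size])
    return tiles

# A toy 1024 x 1536 "slide" yields 2 x 3 = 6 tiles of 512 x 512 pixels.
slide = np.zeros((1024, 1536, 3), dtype=np.uint8)
tiles = tile_image(slide)
print(len(tiles))  # 6
```

Each tile would then be passed through the network's convolutional layers, whose final pooling output provides the 1,536-dimensional feature vector per tile.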
Interestingly, we detected significant associations for genes indicative of immune cell infiltration. Finally, we used image and clinical data to predict patient overall survival, obtaining C-indices in the range [0.64, 1] for 25 cancers, comparable to models using expression and clinical data ([0.61, 1]). For 13 cancers, models with image data outperformed those with expression data, with C-index improvements ranging from 0.05% for Kidney Chromophobe to 20% for Esophageal Adenocarcinoma. In this study, we used an established deep learning architecture to quantify tumour histology. These analyses showed high accuracy in classifying tumour versus normal images. We obtained high accuracy in predicting molecular alterations from image features, indicating that these alterations shape the resulting tumour histology. We showed that image features can be used to predict clinical outcomes for cancer patients with accuracy comparable to, and sometimes higher than, expression data, offering an easy and inexpensive method to complement patient prognosis. Citation Format: Yu Fu, Moritz Gerstung. Molecular and clinical associations of histopathological image features [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 1631.
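The C-index reported for the survival models is the fraction of comparable patient pairs in which the model assigns the higher risk to the patient with the earlier observed event. A minimal sketch of Harrell's concordance index is below; the toy data are illustrative, and the study's actual implementation is not specified in the abstract.

```python
from itertools import combinations

def concordance_index(times, events, risk_scores):
    """Harrell's C-index. times: observed follow-up times;
    events: 1 = event observed, 0 = censored; risk_scores: predicted risk."""
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times skipped in this minimal version
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier observation censored: order of outcomes unknown
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5  # ties in risk count as half-concordant
    return concordant / comparable

# Perfectly ordered toy data: shorter survival always has higher risk.
c = concordance_index([5, 3, 8, 2], [1, 1, 0, 1], [0.4, 0.7, 0.1, 0.9])
print(c)  # 1.0
```

A C-index of 0.5 corresponds to random risk ordering and 1.0 to perfect concordance, which is why the reported ranges of [0.61, 1] and [0.64, 1] indicate predictive value across cancers.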