Abstract

Today, pathology imaging is one of the most common and inexpensive diagnostic and prognostic tools used in oncology, while more sophisticated methods such as next-generation sequencing (NGS) remain relatively expensive and are not routinely used in clinical settings. Deep convolutional neural networks (CNNs) have emerged as an important image analysis technology, enhancing the workflow of pathologists and improving the prediction of patient prognosis and response to treatment. Recently, a few attempts have been made to predict molecular features from tissue imaging using CNNs. While these preliminary results are encouraging, there have been no systematic attempts to link Whole Slide Images (WSIs) to transcriptomic profiles. In this study, we developed a cutting-edge deep learning model named HE2RNA, specifically customized for the direct prediction of gene expression from H&E-stained WSIs without the need for annotation by pathologists. Our model was trained and tested on 8,725 patients from 28 different cancer types available at The Cancer Genome Atlas (TCGA). HE2RNA accurately predicted the expression of six gene signatures related to well-known cancer hallmarks (angiogenesis, hypoxia, DNA repair, cell cycle and immunity) and performed particularly well for signalling pathways involved in immune cell activation. This indicates that suitably designed deep learning models can recognize subtle structures in tissue imaging and relate them to molecular portraits. Moreover, HE2RNA is designed to generate a spatial representation (virtual map) of any well-predicted gene expression, overlaid on the H&E slide. Such a virtual map was validated on a double-stained H&E/CD3 slide obtained from an independent hepatocellular carcinoma sample. This spatialization could be useful in augmenting the pathologists' workflow by providing a virtual multiplexed staining for each H&E slide while overcoming the technical issues associated with immunohistochemistry.
Various important prognostic factors, such as microsatellite instability (MSI), are derived from molecular features. Microsatellite instability refers to the hypermutability of short repetitive genomic sequences caused by impaired DNA mismatch repair. These mutations, frequently observed in gastric and colorectal cancer, are associated with a better response to immunotherapy. We show that the transcriptomic representation learned by our model can be used to improve the performance of MSI status prediction for small datasets of WSIs. This type of setting is common, since large databases of matched RNA-Seq profiles and WSIs are widely available, while databases of matched MSI status and WSIs are scarcer. In the future, such technologies could therefore facilitate universal screening of molecular biomarkers and improved identification of patients who could benefit from new therapeutic strategies.

Citation Format: Elodie Pronier, Benoît Schmauch, Alberto Romagnoni, Charlie Saillard, Pascale Maillé, Julien Calderaro, Meriem Sefta, Sylvain Toldo, Mikhail Zaslavskiy, Thomas Clozel, Matahi Moarii, Pierre Courtiol, Gilles Wainrib. HE2RNA: A deep learning model for transcriptomic learning from digital pathology [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 2105.

