Abstract ERG:TMPRSS2 fusion is present in almost 50% of prostate cancer (PCa) cases in patients of European descent and plays an important role in carcinogenesis and disease progression. ERG status is detected using fluorescence in situ hybridization (FISH) or reverse transcription-polymerase chain reaction (RT-PCR). Because these tests are costly and require special training, there is a need for innovative tools that decrease the cost and streamline the diagnostic process. For this reason, we have developed a deep learning (DL) system capable of inferring ERG fusion status using only digitized hematoxylin and eosin (H&E)-stained slides from PCa patients, and of detecting tissue regions of high diagnostic relevance. To develop the model, we used the PCa TCGA dataset, which includes 436 formalin-fixed paraffin-embedded (FFPE) H&E-stained whole slide images (WSIs) from 393 PCa patients who underwent radical prostatectomy. Subsequently, we evaluated the model’s performance on an independent cohort of 314 WSIs provided by the Johns Hopkins University (natural history cohort). Slides were divided into tiles of 500 × 500 px, from which features were extracted using a pre-trained ResNet50 model. The feature vectors were then used to train an attention-based multiple instance learning framework to predict the slide-level label as either ERG-positive or ERG-negative, and to score slide regions based on their contribution to the slide-level representation. Our model predicts ERG fusion status with an area under the ROC curve (AUC) of 0.84 in the training data and 0.73 in the independent testing cohort. The model also detects tissue regions with high attention scores for each class. To decipher the cellular composition of these highly relevant regions in the cases predicted as ERG-positive or ERG-negative, we used the HoVer-Net model to perform nuclear segmentation and classification into five categories: benign epithelial, tumor, stroma, immune, and necrotic.
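The attention-based pooling step described above can be illustrated with a minimal sketch. The abstract does not specify the attention formulation, so this example assumes a common non-gated form in which each tile feature vector h_k receives a score w · tanh(V h_k), the scores are normalized with a softmax over tiles, and the slide-level representation is the attention-weighted sum of tile features. The function names, the weights `V` and `w`, the toy dimensions, and the logistic classifier are all hypothetical and are not taken from the authors' implementation.

```python
import math
import random


def attention_pool(tile_features, V, w):
    """Pool K tile feature vectors (length D) into one slide-level vector.

    V (H rows of length D) and w (length H) stand in for learned
    attention weights; here they are hypothetical placeholders.
    Returns (slide_vector, attention_scores).
    """
    scores = []
    for h in tile_features:
        # Hidden projection with tanh non-linearity: tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        # Unnormalized attention score: w . tanh(V h)
        scores.append(sum(wj * uj for wj, uj in zip(w, hidden)))

    # Numerically stable softmax across the tiles of one slide
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Slide-level representation: attention-weighted sum of tile features
    dim = len(tile_features[0])
    slide_vector = [
        sum(a * h[d] for a, h in zip(attention, tile_features))
        for d in range(dim)
    ]
    return slide_vector, attention


def predict_slide(slide_vector, clf_weights, clf_bias):
    """Hypothetical logistic classifier giving P(ERG-positive) for a slide."""
    logit = sum(c * z for c, z in zip(clf_weights, slide_vector)) + clf_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy example with random numbers in place of real ResNet50 tile features
rng = random.Random(0)
D, H, K = 8, 4, 5  # feature size, attention hidden size, tiles per slide
tiles = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]
V = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(H)]
w = [rng.gauss(0, 1) for _ in range(H)]
clf_weights = [rng.gauss(0, 1) for _ in range(D)]

slide_vec, attn = attention_pool(tiles, V, w)
prob = predict_slide(slide_vec, clf_weights, 0.0)
```

In this sketch the per-tile attention values in `attn` play the role of the region relevance scores mentioned in the abstract: tiles with larger values contribute more to the slide-level representation.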
Notably, we found that the cellular composition of the highly relevant patches captures prognostic information. Specifically, in the TCGA dataset, a high ratio of neoplastic cells in the relevant patches was significantly associated with worse progression-free survival (PFS), while high ratios of necrotic, stromal, and stromal-to-neoplastic cells were significantly associated with better PFS. Similar findings were obtained in the natural history cohort, in which a high ratio of neoplastic cells was significantly associated with worse overall survival (OS) and metastasis-free survival (MFS), while high ratios of immune, stromal, and stromal-to-neoplastic cells were significantly associated with longer OS and MFS. These results show that ERG fusion status can be inferred from H&E-stained WSIs, demonstrating the benefit of DL systems in extracting tissue morphological features of high diagnostic and prognostic relevance.

Citation Format: Mohamed Omar, Zhuoran Xu, Sophie B. Rand, Daniela C. Salles, Edward M. Schaeffer, Tamara L. Lotan, Massimo Loda, Luigi Marchionni. Detection of ERG:TMPRSS2 gene fusion in prostate cancer from histopathology slides using attention-based deep learning. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5369.