Abstract

Histopathological images are an integral data type for studying cancer. We show that pre-trained convolutional neural networks (CNNs) can be systematically applied across cancer types, enabling comparisons that reveal shared spatial behaviors. We develop CNNs with a common architecture trained on 19 cancer types of The Cancer Genome Atlas (TCGA), analyzing 14,459 hematoxylin and eosin scanned frozen tissue images. Our CNNs are based on the Inception-V3 network and classify TCGA pathologist-annotated tumor/normal status of whole slide images in all 19 cancer types with consistently high AUCs (0.995±0.008). Remarkably, CNNs trained on one tissue are effective in others (AUC 0.88±0.11), with classifier relationships recapitulating known adenocarcinoma, carcinoma, and developmental biology. Moreover, classifier comparisons reveal intra-slide spatial similarities, with an average tile-level correlation of 0.45±0.16 between classifier pairs on the TCGA test sets. In particular, the TCGA-trained classifiers had average tile-level correlations of 0.52±0.09 and 0.58±0.08 on hold-out TCGA lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) test sets, respectively. These relations are reflected in two external datasets: LUAD and LUSC whole slide images from the Clinical Proteomic Tumor Analysis Consortium. The CNNs trained on TCGA achieved cross-classification AUCs of 0.75±0.12 and 0.73±0.13 on the LUAD and LUSC external validation sets, respectively, with average tile-level correlations of 0.38±0.09 and 0.39±0.08. Breast, bladder, and uterine cancers have spatial patterns that are particularly easy to detect, suggesting these cancers can serve as canonical types for image analysis. This study illustrates that pre-trained CNNs can detect tumor features across a wide range of cancers, suggesting the presence of pan-cancer tumor features.
These shared features allow datasets to be combined when analyzing small samples, narrowing the parameter search space of CNN models. Citation Format: Javad Noorbakhsh, Saman Farahmand, Ali Foroughi pour, Sandeep Namburi, Dennis Caruana, David Rimm, Mohammad Soltanieh-ha, Kourosh Zarringhalam, Jeffrey H. Chuang. Deep learning identifies conserved pan-cancer tumor features [abstract]. In: Proceedings of the AACR Virtual Special Conference on Artificial Intelligence, Diagnosis, and Imaging; 2021 Jan 13-14. Philadelphia (PA): AACR; Clin Cancer Res 2021;27(5_Suppl):Abstract nr PO-003.
