Abstract

Lung squamous cell carcinoma (LSCC) is preceded by the development of bronchial premalignant lesions (PMLs). PMLs progress through a series of histologic grades characterized by molecular and morphologic alterations. To enhance our understanding of PML progression to lung cancer, we are developing deep learning methods to quantitate PML histologic features and localize PML severity along the histologic disease spectrum. We leveraged whole slide images (WSIs) of lung tumors (lung adenocarcinoma, LUAD, and LSCC) and non-involved adjacent (referred to as ‘normal’) tissue from the Clinical Proteomic Tumor Analysis Consortium (CPTAC), the National Lung Screening Trial (NLST) and The Cancer Genome Atlas (TCGA). The NLST WSIs were used to train a self-supervised contrastive learning (SimCLR) model. A graph isomorphism network (GIN) was trained on CPTAC WSIs, using the SimCLR features, to perform WSI-level prediction of tissue subtype (LUAD, LSCC, normal). Model performance was assessed using the area under the receiver operating characteristic curve (AUC). The model was then used to learn WSI-level features of endobronchial biopsies of PMLs from two datasets (PCGA, n=365 samples spanning all histologic grades, and CIS, n=112, only high-grade samples). The features from the final layer of the GIN were used to compute principal components (PCs) and to visualize the relationships between samples using t-SNE and UMAP. To interpret how the GIN processes WSI data, we applied gradient-based class activation mapping (Grad-CAM) to the graphs. High model performance was observed on the CPTAC test dataset (Accuracy = 87%, AUC >= 0.89) but dropped on the TCGA dataset (Accuracy = 72%, AUC >= 0.75). The GIN-CAMs highlighted WSI regions that were highly associated with the output label, as judged against expert pathologist annotations of TCGA cases. Clustering revealed distinct groupings of normal, LUAD and LSCC subtypes (PC1 & PC2, p<0.01).
Most WSIs from the PCGA data clustered with normal tissues (PC1 & PC2: p<0.01 for PCGA v LUAD, PCGA v LSCC). However, a portion of the CIS samples grouped closely with LSCC tumor cases (74% of CIS WSIs were classified as LSCC tumors; PC1 & PC2: p<0.01 for CIS v LUAD; PC1: p<0.01 for CIS v LSCC). Additionally, the feature distribution of CIS samples that progressed towards LSCC tumors was significantly different from that of samples that regressed (PC1 & PC3: p<0.05 for Progressive v Regressive). The GIN generates robust features that are distinctly and consistently observed in the different histologic groups across multiple cohorts. These features can also stratify progressive and regressive high-grade lesions. The stratification of PMLs is an important step in designing novel interception strategies to prevent the development of lung cancer, and our results suggest that pathology data may be valuable to include in future biomarkers.

Citation Format: Rushin Gindra, Yi Zheng, Regan Conrad, Emily Green, Sarah Mazzilli, Ehab Billatos, Mary Reid, Eric Burks, Vijaya Kolachalama, Jennifer E. Beane. Representation learning for histological profiling of lung squamous premalignant lesions and tumors. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5433.
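The abstract reports AUC as a lower bound across the three tissue labels (LUAD, LSCC, normal), which is consistent with a one-vs-rest evaluation, although the abstract does not say how the AUCs were computed. The following is a minimal sketch under that assumption, in plain Python. The function names (`binary_auc`, `one_vs_rest_auc`) and the toy probabilities are hypothetical illustrations, not the authors' code or data.

```python
from typing import Dict, List, Sequence


def binary_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive sample outscores a negative one.

    Equivalent to the normalized Mann-Whitney U statistic; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    probs: Sequence[Sequence[float]],
    y_true: Sequence[str],
    classes: Sequence[str],
) -> Dict[str, float]:
    """Compute one AUC per class, treating that class as positive and the rest as negative."""
    result: Dict[str, float] = {}
    for k, name in enumerate(classes):
        scores = [row[k] for row in probs]           # predicted probability of class k
        labels = [1 if y == name else 0 for y in y_true]
        result[name] = binary_auc(scores, labels)
    return result


# Hypothetical toy predictions for six slides (columns: LUAD, LSCC, normal).
classes: List[str] = ["LUAD", "LSCC", "normal"]
probs = [
    [0.80, 0.10, 0.10],  # true label: LUAD
    [0.60, 0.30, 0.10],  # true label: LUAD
    [0.20, 0.70, 0.10],  # true label: LSCC
    [0.50, 0.25, 0.25],  # true label: LSCC (scored below two non-LSCC slides)
    [0.10, 0.20, 0.70],  # true label: normal
    [0.30, 0.30, 0.40],  # true label: normal
]
y_true = ["LUAD", "LUAD", "LSCC", "LSCC", "normal", "normal"]

aucs = one_vs_rest_auc(probs, y_true, classes)
min_auc = min(aucs.values())  # a summary in the style of "AUC >= x"
```

On this toy input the LSCC class has the lowest AUC because one LSCC slide receives a lower LSCC probability than two slides from other classes, so `min_auc` plays the role of the bound quoted in a statement such as "AUC >= 0.89". The pairwise loop is quadratic in the number of slides; a rank-based implementation would be preferable for large cohorts.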
