Abstract

Background: Homologous recombination deficiency (HRD), originally described in tumors from patients with germline mutations in the BRCA1/2 genes, renders cells sensitive to poly(ADP-ribose) polymerase inhibitors (PARPi) (1). HRD can also be caused by mutations in other genes and is prevalent across multiple cancer types (2). HRD status is of clinical interest because it can indicate patient eligibility for treatment with PARPi. Currently, HRD status is determined by sequencing to identify BRCA mutations or genomic instability, but this approach has a high rate of failure (3). In this research study, we apply a deep learning-based computational approach to infer HRD status directly from digitized images of hematoxylin and eosin (H&E)-stained histology samples of breast cancer tumors.

Methods: Digitized whole slide images (WSI) of 931 H&E-stained, formalin-fixed and paraffin-embedded (FFPE) breast adenocarcinoma (BRCA) tumor biopsies from The Cancer Genome Atlas (TCGA) were used to train machine learning (ML) models to identify patients who are HRD, using two approaches: human-interpretable features (HIFs) and end-to-end (E2E) modeling. To train the models, samples were designated either HRD or homologous recombination proficient (HRP) and split into training and validation sets. Designations were based on a previously generated aggregate HRD score (calculated from regions of loss of heterozygosity, large-scale genomic instability, and telomeric allelic imbalance) from the genomic analysis of the PanCancerAtlas (2). We applied an untuned HRD score threshold of 45 to assign class labels, resulting in 142/931 (15.3%) HRD cases. Board-certified pathologists (N=93) annotated tissue regions and cellular foci on the PathAI research platform, yielding 65,477 annotations.
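The class-labeling step described above can be sketched as follows. This is a minimal illustration, not the study's code: the function names are hypothetical, the aggregate score is assumed to be the plain sum of the three genomic scar counts, and a score equal to the threshold is assumed to count as HRD (the abstract states only the threshold value of 45).

```python
HRD_THRESHOLD = 45  # untuned threshold stated in the abstract


def aggregate_hrd_score(loh, lst, tai):
    """Combine the three genomic scar counts into one score.

    Assumption: the aggregate is a plain sum of the loss-of-heterozygosity,
    large-scale instability, and telomeric allelic imbalance counts.
    """
    return loh + lst + tai


def assign_label(score, threshold=HRD_THRESHOLD):
    """Return "HRD" or "HRP" for one sample.

    Assumption: scores at or above the threshold are labeled HRD.
    """
    return "HRD" if score >= threshold else "HRP"


def hrd_fraction(labels):
    """Fraction of samples labeled HRD in a cohort."""
    return sum(label == "HRD" for label in labels) / len(labels)


# Toy usage with made-up scar counts (not TCGA data).
toy_cohort = [(20, 15, 10), (5, 8, 3), (30, 22, 18)]
toy_labels = [assign_label(aggregate_hrd_score(*counts)) for counts in toy_cohort]
print(toy_labels, hrd_fraction(toy_labels))
```

Applied to the reported cohort, the same fraction is 142/931, which rounds to the 15.3% stated above.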
ML models based on convolutional neural networks were trained to recognize breast cancer cells, lymphocytes, macrophages, plasma cells, and fibroblasts, as well as tissue compartments including cancer epithelium, cancer stroma, and necrosis, within the H&E-stained breast cancer samples. Two pipelines were used to construct H&E histology-based classifiers of HRD status. The first was a weakly supervised “end-to-end” model in which ResNets extracted features from small image patches and an attention module aggregated across patches to predict HRD status directly. The second, HIF-based approach used the tissue segmentation and cell identification classifiers to quantify histological features in the WSI. From the labeled images, we extracted 600 HIFs that capture complex relationships between cell and tissue types. HIFs and patient clinical covariates were used as input to a sparse group lasso model to predict the HRD status of the associated patients.

Results: ML models predicted HRD status from H&E-stained WSI. The area under the receiver operating characteristic curve (AUROC) was 0.87 for the HIF model and 0.80 for the E2E model. Both classifiers achieved high sensitivity for HRD status (0.86) with more moderate precision (F1 score: HIF 0.80, E2E 0.72). Our model combining HIFs with clinical covariates revealed morphological features that were significantly associated with HRD compared with HRP. HRD samples were enriched for areas of necrosis, stromal fibroblasts, and tumor-infiltrating lymphocytes (p < 0.001, Mann-Whitney U test).

Conclusions: Computational models built with the PathAI research platform identified HRD-positive patients directly from routinely collected H&E-stained WSIs and identified a histological basis for how mutational signatures impact the tumor microenvironment.

Disclaimer: The PathAI platform and HRD model are not intended for diagnostic purposes.

1. Farmer et al., 2005. Nature 434(7035):917-21.
2. Knijnenburg et al., 2018. Cell Rep 23:239-254.
3. Hoppe et al., 2018. JNCI 110(7):djy085.
4. Coudray et al., 2018. Nat Med 24:1559-1567.
5. Kather et al., 2019. Nat Med 25:1054-1056.

Citation Format: Amaro Taylor-Weiner, Aryan Pedawi, Wan Fung Chui, James Diao, Jason Wang, Victoria Mountain, Benjamin Glass, Hunter Elliott, Ilan Wapinski, Michael Montalto, Aditya Khosla, Andrew H. Beck. Deep-learning based prediction of homologous recombination deficiency (hrd) status from histological features in breast cancer; a research study [abstract]. In: Proceedings of the 2020 San Antonio Breast Cancer Virtual Symposium; 2020 Dec 8-11; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2021;81(4 Suppl):Abstract nr PD6-04.