Abstract

206 Background: In prostate cancer, homologous recombination deficiency (HRD) is associated with poor prognosis and with sensitivity to DNA-damaging agents and DNA damage repair (DDR) inhibitors. As new classes of DDR inhibitors become available, identifying patients with HRD will be critical for treatment selection. Here, we present machine learning (ML)-based models trained to predict HRD status directly from hematoxylin and eosin (H&E)-stained whole slide images (WSI).
Methods: ML models were trained to predict and segment cells and tissue regions within the tumor microenvironment (TME) using annotated WSI (N=91,021 annotations) of H&E-stained resections from The Cancer Genome Atlas Prostate Adenocarcinoma (TCGA PRAD) dataset (N=401) and needle core biopsies from a proprietary dataset (N=1,000). Quantitative human interpretable features (HIFs) that describe the TME composition were extracted. Three models were trained to predict HRD status using 373 WSI with known HRD score (TCGA PRAD; train N=259, validation N=76, test N=38). Two models used input from the TME model: an HIF-based multivariate logistic regression model, and a graph neural network (GNN) whose predictions are based on the complex spatial relationships within the TME. The third, an end-to-end (E2E) multiple instance learning model, predicted HRD status directly from the WSI. Two cutoffs for HRD were defined using Gaussian mixture models, resulting in 99 WSI (train N=72, validation N=18, test N=9) positive at the Genomic Instability >16 events cutoff, and 58 WSI (train N=44, validation N=10, test N=4) positive at the Genomic Instability >22 events cutoff. H&E-based ML predictions of HRD were then compared with HRD status determined by whole-exome sequencing in an independent validation set of 45 biopsies and 16 resections from a biobank of metastatic castration-resistant prostate cancer.
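
The abstract does not say how the Gaussian mixture models were turned into HRD cutoffs. As an illustration only, the sketch below assumes a two-component, one-dimensional mixture fitted by expectation-maximization, with the cutoff placed where both components are equally likely. The function names (`fit_two_gaussians`, `mixture_cutoff`) and the synthetic scores are hypothetical and are not taken from the study.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_high(x, weights, means, variances):
    """Posterior probability that x belongs to the higher-mean component."""
    p_low = weights[0] * normal_pdf(x, means[0], variances[0])
    p_high = weights[1] * normal_pdf(x, means[1], variances[1])
    total = p_low + p_high
    return 0.5 if total == 0.0 else p_high / total


def fit_two_gaussians(values, iterations=200):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    n = len(values)
    ordered = sorted(values)
    half = n // 2
    # Initialise the two means from the lower and upper halves of the data.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    grand_mean = sum(values) / n
    grand_var = sum((v - grand_mean) ** 2 for v in values) / n
    variances = [grand_var, grand_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of the higher-mean component for each value.
        resp = [posterior_high(v, weights, means, variances) for v in values]
        # M-step: re-estimate means, variances, and mixing weights.
        n_high = sum(resp)
        n_low = n - n_high
        means = [
            sum((1.0 - r) * v for r, v in zip(resp, values)) / n_low,
            sum(r * v for r, v in zip(resp, values)) / n_high,
        ]
        variances = [
            max(sum((1.0 - r) * (v - means[0]) ** 2 for r, v in zip(resp, values)) / n_low, 1e-6),
            max(sum(r * (v - means[1]) ** 2 for r, v in zip(resp, values)) / n_high, 1e-6),
        ]
        weights = [n_low / n, n_high / n]
    return weights, means, variances


def mixture_cutoff(weights, means, variances, steps=60):
    """Bisect between the component means for the point where the posterior is 0.5.

    Assumes the components are separated enough that the posterior of the
    higher-mean component is below 0.5 at means[0] and above 0.5 at means[1].
    """
    lo, hi = means[0], means[1]
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if posterior_high(mid, weights, means, variances) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Synthetic scores for illustration only; these are NOT the study's data.
random.seed(0)
scores = [random.gauss(8, 3) for _ in range(300)] + [random.gauss(28, 4) for _ in range(100)]
weights, means, variances = fit_two_gaussians(scores)
cutoff = mixture_cutoff(weights, means, variances)
print(f"component means: {means[0]:.1f}, {means[1]:.1f}; cutoff: {cutoff:.1f}")
```

Samples scoring above `cutoff` would be labeled positive under this assumption. The study reports two cutoffs (>16 and >22 events), which may reflect a different mixture configuration or decision rule than the one sketched here.
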
Results: In the TCGA test set of resection samples, all three models showed moderate to strong predictive performance for HRD status, with the HIF model performing best (AUROC 0.87, sensitivity 0.88, specificity 0.62). The same HIF model performed comparably (AUROC 0.85, sensitivity 0.93, specificity 0.67) on the resection samples from the independent validation set. However, performance declined (AUROC 0.69, sensitivity 0.91, specificity 0.3) when both resection and needle biopsy samples were included, highlighting the importance of a representative training set for robust performance in a real-world setting. Further model training and validation with a more diverse dataset are required to accurately assess model performance on needle biopsies.
Conclusions: ML models trained on prostate cancer resection samples performed well in predicting HRD status when applied to the same sample type, demonstrating the potential of ML models to predict genomic biomarker status in surgical specimens to inform treatment decisions.
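
For readers less familiar with the reported metrics, the following minimal sketch shows how AUROC, sensitivity, and specificity can be computed from binary labels and model scores. The helper names, the toy labels and scores, and the 0.5 decision threshold are all assumptions for illustration; the abstract does not state the threshold used for the reported sensitivity and specificity.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as half (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (hypothetical values, not study data): 1 = HRD-positive.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
toy_auc = auroc(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
print(toy_auc, toy_sens, toy_spec)  # 0.875 0.75 0.75
```

AUROC summarizes ranking quality across all thresholds, whereas sensitivity and specificity depend on the chosen threshold, so a model can combine high sensitivity with low specificity at a given operating point.
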
