Abstract

Background
PPARG is a cell lineage-determining transcription factor in muscle-invasive urothelial carcinoma (MIUC), where high expression is associated with the luminal subtype (1, 2). As FX-909 (2), a PPARG inverse agonist, enters the clinic, biomarkers that reflect the luminal subtype could identify patients with the potential to respond to PPARG inhibition. Luminal status is generally determined by RNA-seq and/or multiple immunohistochemistry stains, which are costly and time-consuming. However, MIUC biopsies are routinely stained with hematoxylin and eosin (H&E). Machine learning (ML)-driven analysis of H&E-stained tissue may enable the identification of patients with luminal MIUC and offer advantages over the current molecular approach.

Methods
H&E-stained slides from 367 unique primary MIUC cases from the TCGA BLCA dataset were split into training (70%), validation (15%), and held-out test (15%) sets while preserving the distribution of patient metadata. A curated retrospective cohort of 42 localized, stage III-IV primary MIUCs was used as an independent test set. Each case was classified molecularly as luminal (luminal papillary, luminal, and luminal infiltrated subtypes) or non-luminal (basal-squamous and neuronal subtypes), and this classification was used as ground truth (3). Pretrained artifact and tissue segmentation models were deployed on all images to identify artifact-free areas of cancer and cancer-associated stroma. An end-to-end (E2E) additive multiple instance learning model was trained on the training set to identify luminal cases. Top-performing model iterations were compared on the validation set, and the optimal iteration was deployed on both test sets.

Results
We assessed the performance of our E2E model in predicting luminal status using the molecular subtypes derived from Robertson et al. as ground truth (3). The E2E model showed excellent performance when predicting luminal status in the TCGA validation, TCGA test, and independent test sets (AUROC = 0.96, 0.95, and 0.97, respectively). Accuracy in all three cohorts was 89-90%, with a sensitivity of 0.86-0.96, a specificity of 0.82-0.94, and an F1 score of 0.88-0.90.

Conclusions
We generated a robust ML model that accurately predicts luminal MIUC from H&E-stained slides. Luminal MIUC is dependent on PPARG, and PPARG inverse agonism represents a promising therapeutic approach for MIUC. Coupled with the first-in-class FX-909 therapeutic entering the clinic, the strong performance of our model highlights its potential as a precision biomarker to identify patients with advanced urothelial carcinoma who are likely to respond to PPARG inhibition.
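
For readers less familiar with the reported metrics, the following is a minimal sketch of how AUROC, accuracy, sensitivity, specificity, and F1 can be computed for a binary luminal (1) versus non-luminal (0) classifier. It is not the authors' evaluation code. The function names `auroc` and `binary_metrics` are hypothetical, and the 0.5 decision threshold is an assumption, since the abstract does not state the operating point used.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(labels: Sequence[int], scores: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Threshold the scores and derive metrics from the confusion matrix.

    threshold=0.5 is an illustrative assumption, not a value from the study.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(labels)),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }
```

Calling `auroc(labels, scores)` and `binary_metrics(labels, scores)` on per-case model scores gives the same kinds of quantities reported in the Results. AUROC does not depend on a threshold, whereas accuracy, sensitivity, specificity, and F1 all change with the chosen cutoff.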