344

Background: Genomic classifiers improve post-radical prostatectomy (RP) risk stratification compared with conventional clinical-pathologic parameters, but they are tissue-destructive and expensive. In contrast, artificial intelligence algorithms that use diagnostic hematoxylin and eosin (H&E)-stained slides for risk stratification conserve tissue and could be made widely available at the point of care. We compared the predictive output of a deep learning (DL)-based algorithm applied to H&E-stained whole slide images (WSI) of prostate tumors with that of two commercial genomic classifiers, Decipher and Prolaris, in three diverse RP cohorts with follow-up for metastasis.

Methods: We used subsets of three previously published Johns Hopkins RP cohorts with available genomic classifier data. The Natural History Cohort (n=249, Decipher test, PMID: 26058959) and the Case-Cohort (n=210, Prolaris CCP test, PMID: 36006048) both used a case-cohort design on the outcome of metastasis and consisted predominantly of self-identified White patients, while the Race Cohort (n=93, Decipher test, PMID: 31969336) included only self-identified Black patients with clinical follow-up. For each cohort, a single representative H&E-stained slide from the dominant tumor nodule at RP was scanned. The DL algorithm first detected tumor in the H&E-stained WSI and then applied a classification model to predict metastasis, with or without clinical parameters (age, race, pre-operative PSA, pathologic T- and N-stage, and margin status). The algorithm was trained (n=197) and validated (n=52) on patients from the Natural History Cohort and tested on the Race Cohort and the Case-Cohort. Performance was evaluated with Harrell’s C-index, based on unadjusted Cox models for time to metastasis.

Results: The C-index for the WSI-based DL algorithm on the Natural History Cohort validation subset was 0.766 (95% confidence interval [CI]: 0.730-0.802) compared to 0.732 (95% CI: 0.698-0.766) for the Decipher score. The C-index for the WSI-based DL algorithm in the Race Cohort was 0.804 (95% CI: 0.790-0.818) compared to 0.724 (95% CI: 0.721-0.726) for the Decipher score. The C-index for the WSI-based DL algorithm in the Case-Cohort was 0.840 (95% CI: 0.829-0.851) compared to 0.801 (95% CI: 0.800-0.802) for the Prolaris CCP score. The C-index values for the DL algorithm using WSI plus clinical-pathologic parameters were 0.836 (95% CI: 0.796-0.876), 0.882 (95% CI: 0.869-0.894), and 0.890 (95% CI: 0.878-0.901) for the Natural History Cohort, Race Cohort, and Case-Cohort, respectively.

Conclusions: DL algorithms using WSI, with or without clinical-pathologic parameters, outperform currently employed genomic classifiers for the prediction of metastasis. Validation in additional multi-institutional and racially diverse cohorts is underway.
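
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how Harrell's C-index can be computed for a single risk score against time-to-event data. It is not the authors' code: the function name `harrell_c_index` and the toy data are hypothetical, and the tie-handling rule (pairs with tied follow-up times are skipped, tied risk scores count as half-concordant) is one common convention assumed here. For a Cox model with a single predictor and a positive coefficient, ranking by the model's linear predictor is the same as ranking by the raw score, so the score can be passed in directly.

```python
from itertools import combinations


def harrell_c_index(times, events, risks):
    """Fraction of comparable patient pairs ranked correctly by `risks`.

    times  -- follow-up time for each patient
    events -- 1 if metastasis was observed, 0 if the patient was censored
    risks  -- predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the patient
        # with the shorter time actually had the event (assumed convention).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five patients, two of them censored.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 0, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.7, 0.2, 0.8]
toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
```

In the toy data, six pairs are comparable and five of them are ordered correctly, so `toy_c` is 5/6. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the C-index values in the Results should be read.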