Though several nomograms exist, machine learning (ML) approaches might improve prediction of pathologic stage in patients with prostate cancer. The objective was to develop ML models that predict pathologic stage and outperform existing nomograms that use readily available clinicopathologic variables. Patients with prostate adenocarcinoma who underwent surgery were identified in the National Cancer Database. Seven ML models were trained to predict organ-confined (OC) disease, extracapsular extension (ECE), seminal vesicle invasion (SVI), and lymph node involvement (LNI). Model performance was measured using area under the curve (AUC) on a holdout testing data set. Clinical utility was evaluated using decision curve analysis (DCA). Performance metrics were confirmed on an external validation data set. The ML-based extreme gradient boosted trees model achieved the best performance, with AUCs of 0.744, 0.749, 0.816, and 0.811 for the OC, ECE, SVI, and LNI models, respectively. The Memorial Sloan Kettering (MSK) nomograms achieved AUCs of 0.708, 0.742, 0.806, and 0.802 for the OC, ECE, SVI, and LNI models, respectively. The extreme gradient boosted trees models also performed best on DCA. Findings were consistent on both the holdout internal validation data set and the external validation data set. Our ML models predicted pathologic stage better than existing nomograms. Accurate prediction of pathologic stage can help oncologists and patients determine optimal definitive treatment options for patients with prostate cancer.
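For readers unfamiliar with decision curve analysis, the quantity it plots is the net benefit of acting on a model's predictions at a chosen risk threshold. The following is a minimal sketch of the standard net benefit formula, not the authors' implementation; the function names and the toy labels and probabilities are hypothetical and are not drawn from the study data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients whose predicted risk >= threshold.

    Standard DCA formula: TP/n - (FP/n) * pt / (1 - pt),
    where pt is the threshold probability.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene on every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data (1 = adverse pathology present), for illustration only.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3, 0.6, 0.05]

model_nb = net_benefit(y_true, y_prob, threshold=0.5)
treat_all_nb = net_benefit_treat_all(y_true, threshold=0.5)
```

A decision curve repeats this calculation over a range of thresholds; a model is preferred at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.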