Abstract

Intermediate-risk prostate cancer (IRPC) is a heterogeneous disease with various treatment options. Existing efforts to further risk-stratify this disease entity were developed using a small subset of predictors determined a priori. Here, we use statistical machine learning that draws on a larger pool of demographic and clinical features to better predict post-radical prostatectomy (RP) pathologic outcomes in IRPC. Knowing the probability of high-risk pathology, and thus the likelihood of adjuvant radiation, may inform patients deciding between upfront surgery and radiation.

We retrospectively reviewed patients who underwent RP from 6/2005 to 5/2015 at our institution and included 1560 patients with clinical IRPC, defined per NCCN. Predictors included demographics, PSA and location-specific biopsy findings. The post-RP surgical pathology variables analyzed were positive margins, Gleason upgrading and pathologic upstaging. Analyses were performed in MATLAB. We imputed missing data and used minority oversampling to balance the data. After pre-processing, 69 predictors were eligible. We used a combination of feature selection methods to decrease multicollinearity and overfitting. Learning methods included logistic regression (LR) with or without lasso regularization and support vector machine with a Gaussian kernel (SVM) with or without forward feature selection. 10-fold cross-validation was used to estimate the performance of our model-building methods.

Statistical measures are reported in the order sensitivity/specificity/positive predictive value (PPV)/negative predictive value (NPV).

Positive margins: SVM had excellent results with 98%/95%/95%/98% (AUC ∼1.0). Feature selection chose 11 predictors with 71%/75%/74%/72% (AUC 0.83). LR with or without lasso had poorer outcomes except in specificity and NPV, with <10%/99%/<34%/87%.

Gleason upgrading: SVM performance was again excellent with 95%/89%/90%/95% (AUC 0.97). Feature selection narrowed the model to 21 predictors with good performance at 85%/73%/75%/83% (AUC 0.87). LR was inferior except in specificity, with 24%/93%/47%/82% (AUC 0.68). Lasso selected 6 features with 6%/100%/100%/80% (AUC 0.73).

Pathologic upstaging: SVM was again superior except in specificity, with 79%/77%/77%/79% (AUC 0.85). Feature selection resulted in 30 variables with 74%/74%/73%/75% (AUC 0.82). LR with or without lasso (39 features) gave almost identical performance, with 50%/84%/65%/74% (AUC 0.73).

We were able to build statistical machine learning models of varying complexity to accurately estimate post-RP pathological outcomes. While overall SVM performance was superior, likely due to its ability to handle high-dimensional data, LR had unique advantages, suggesting that an ensemble method may be optimal. Further development of these models, including deep learning, may allow us to inform patients and colleagues whether upfront surgery or radiation would be the optimal choice.
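
The four-part reporting order used above (sensitivity/specificity/PPV/NPV) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code: the names `ConfusionCounts` and `format_report` and the counts in the usage line are hypothetical, chosen only to show how the four measures follow from a binary confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary confusion matrix (hypothetical container)."""
    tp: int  # true positives: adverse pathology predicted and present
    fp: int  # false positives: predicted but absent
    tn: int  # true negatives: not predicted and absent
    fn: int  # false negatives: not predicted but present


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def format_report(c: ConfusionCounts) -> str:
    """Format results as sensitivity/specificity/PPV/NPV in whole percent."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # fraction of true cases detected
    specificity = _ratio(c.tn, c.tn + c.fp)  # fraction of non-cases cleared
    ppv = _ratio(c.tp, c.tp + c.fp)          # positive calls that are correct
    npv = _ratio(c.tn, c.tn + c.fn)          # negative calls that are correct
    return "/".join(f"{v:.0%}" for v in (sensitivity, specificity, ppv, npv))


# Hypothetical counts for illustration only; not data from this study.
example = ConfusionCounts(tp=80, fp=10, tn=90, fn=20)
print(format_report(example))
```

Because PPV and NPV depend on how common the outcome is in the evaluated sample, the same classifier can show very different values on balanced and unbalanced data, which is one reason to read the four numbers together rather than in isolation.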
