Abstract

e17554 Background: Understanding and predicting the trajectories of prostate cancer patients at the time of diagnosis is critical for choosing an appropriate treatment plan. Evidence-based approaches to outcome prediction include machine learning algorithms that harness health record data. Methods: All analyses used the Veterans Affairs Clinical Data Warehouse (CDW). We included all individuals with a non-metastatic (early-stage) prostate cancer diagnosis between 2002 and 2017, as documented in the CDW cancer registry (N = 111,351). Predictors were demographics (age at diagnosis, race), disease staging parameters abstracted at diagnosis (AJCC stage grouping, Gleason score, SEER summary stage), and prostate-specific antigen (PSA) laboratory values from the 5 years before diagnosis (last value, second-to-last value, average, minimum, maximum, rate of change between the last two PSA values, and density). The predicted outcome was disease progression at 2 years (N = 3,469) and 5 years (N = 6,325), defined as metastasis (indicated by treatment with abiraterone, sipuleucel-T, enzalutamide, or radium-223), registry-recorded cancer-related death, or PSA > 50. We trained prediction models with four machine learning classifiers: random forest, k-nearest neighbors, decision tree, and XGBoost, all with hyperparameter optimization. For testing, we used two approaches: (1) a 20% sample held out at the beginning of the study, and (2) a stratified train/test split on the remaining data. Results: The table below shows the performance of the best classifier, XGBoost. The top five predictors of disease progression were the last PSA, Gleason score, maximum PSA, age at diagnosis, and SEER summary stage. The last PSA had a significantly higher contribution than the other predictors. More than one PSA value was important for prediction, which underscores the need to investigate the PSA trajectory in the period before diagnosis. Model performance was robust across the 2-year and 5-year outcomes. Conclusions: A machine learning-based XGBoost classifier can be integrated into clinical decision support at diagnosis to robustly predict disease progression at 2 and 5 years. [Table: see text]
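
The PSA-derived predictors listed in the Methods can be illustrated with a short sketch. This is not the authors' code: the function name `psa_features`, the `(date, value)` input format, the strict "before diagnosis" cutoff, and the choice to express the rate of change per year are all assumptions made for illustration. PSA density is omitted because the abstract does not say how it was computed.

```python
from datetime import date, timedelta
from statistics import mean


def psa_features(history, diagnosis_date, lookback_years=5):
    """Summarize PSA results from the look-back window before diagnosis.

    history: list of (date, psa_value) tuples (hypothetical input format).
    Returns a dict of features, or None if no PSA falls in the window.
    """
    # Assumption: the window is lookback_years * 365.25 days, rounded.
    window_start = diagnosis_date - timedelta(days=round(365.25 * lookback_years))

    # Assumption: only results dated strictly before diagnosis are used.
    window = sorted(
        (d, v) for d, v in history if window_start <= d < diagnosis_date
    )
    if not window:
        return None

    values = [v for _, v in window]
    features = {
        "last": values[-1],
        "before_last": values[-2] if len(values) > 1 else None,
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "rate_of_change": None,  # needs two results on different days
    }

    if len(window) > 1:
        (d_prev, v_prev), (d_last, v_last) = window[-2], window[-1]
        days = (d_last - d_prev).days
        if days > 0:
            # Assumption: rate is expressed as change in PSA per year.
            features["rate_of_change"] = (v_last - v_prev) / days * 365.25

    return features


# Made-up example values, not study data.
history = [
    (date(2009, 3, 1), 3.0),  # falls outside the 5-year window
    (date(2012, 6, 1), 4.0),
    (date(2013, 6, 1), 6.0),
    (date(2014, 6, 1), 8.0),
]
features = psa_features(history, date(2014, 9, 1))
```

Returning `None` for patients with no PSA in the window leaves the handling of missing values to the downstream model, which is one possible design; the abstract does not describe how missing PSA data were treated.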
