Abstract

INTRODUCTION: Accurate prognostic prediction is important for treatment planning in patients with pancreatic cancer (PC). Although clinical and socioeconomic factors affect patient survival, traditional prognostic evaluation methods (e.g., TNM staging) do not incorporate them. A more accurate and comprehensive tool is needed. We evaluated machine learning (ML)-based analytic models for predicting the survival of patients with PC using data from the Surveillance, Epidemiology, and End Results (SEER) database. METHODS: The SEER database was queried to identify patients aged ≥18 years with histologically confirmed PC diagnosed between 2004 and 2015. Demographic, socioeconomic, and clinical variables (AJCC stage; tumor site, grade, and size; treatment received; survival in months) were extracted and analyzed. Using ML software, an ensemble of predictive analytic models (classification tree-based, Bayesian network, neural network, support vector machine, and k-nearest neighbor classifiers) was built and trained with supervised learning algorithms to predict 1-year survival probability. Separate models were built using only baseline variables (A model) and using baseline plus treatment-related variables, i.e., surgery, chemotherapy, and radiotherapy (B model). RESULTS: Data on 42,673 patients with PC were used to develop the predictive models [Table 1; mean age, 67 years; 52% male; 80% white; stage I/II/III/IV/unknown: 6.0%/27.7%/10.3%/52.5%/3.5%; median survival (range): 6 (2–13) months, with 78% surviving ≤12 months]. Ten independent prognostic variables were identified in multivariate Cox regression analysis: year of diagnosis, tumor stage/grade, presence of metastasis, age, tumor size, marital status, insurance status, US region, and socioeconomic factors. These variables were entered into the ML models. Figure 1 shows the predictive importance of the different variables used in the two models.
The area under the receiver operating characteristic curve (AUROC) of the A and B models for predicting 1-year survival probability was 0.804 (95% CI: 0.800–0.808) and 0.832 (95% CI: 0.828–0.836), respectively; both were significantly higher than that of the TNM staging system (AUROC = 0.696; 95% CI: 0.691–0.700; P < 0.0001; Figure 2). CONCLUSION: Using advanced ML techniques to analyze a large dataset, we developed predictive models that accurately predicted the survival of patients with PC and were superior to the existing TNM staging system. These models may serve as an effective tool for prognostic evaluation of PC in clinical settings.
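
The AUROC comparison reported above can be illustrated with a minimal sketch. The abstract does not state how the confidence intervals or the P value were obtained, so the percentile bootstrap used here is an assumption, not the authors' method. The function names (`auroc`, `bootstrap_ci`) and the eight toy patients are hypothetical and serve only to show the calculation; they are not SEER data.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive case outranks a
    random negative case (Mann-Whitney formulation); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample needs both classes for AUROC to be defined
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical toy data: 1 = alive at 1 year, 0 = died within 1 year.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Continuous survival probabilities, as an ML model might output.
model_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
# Coarse ordinal score standing in for a staging system (higher = better).
stage_scores = [3, 3, 2, 1, 2, 2, 1, 1]

print("model AUROC:", auroc(labels, model_scores))  # 0.9375
print("stage AUROC:", auroc(labels, stage_scores))  # 0.75
print("model 95% CI:", bootstrap_ci(labels, model_scores))
```

In the toy data the coarse ordinal score produces many ties between survivors and non-survivors, which is one reason a staging variable alone can discriminate less well than a continuous model output. With only eight cases the bootstrap interval is very wide; the narrow intervals in the abstract reflect the cohort of 42,673 patients.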
