Background and objectives: Deep learning (DL)-based models for predicting the survival of patients with localized breast cancer use only time-fixed covariates, i.e., patient and cancer data at the time of diagnosis. These predictions are inherently error-prone because they do not consider time-varying events that occur after the initial diagnosis. Our objective is to improve predictive modeling of the survival of patients with localized breast cancer by considering both time-fixed and time-varying covariates, thereby taking into account the progression of a patient’s health status over time.
Methods: We extended four DL-based predictive survival models (DeepSurv, DeepHit, Nnet-survival, and Cox-Time) that deal with right-censored time-to-event data to consider not only a patient’s time-fixed covariates (patient and cancer data at diagnosis) but also a patient’s time-varying covariates (e.g., treatments, comorbidities, advancing age, frailty index, adverse events from treatment). Our study data were drawn from the SEER-Medicare linked dataset from 1991 to 2016 and covered women diagnosed with stage I–III breast cancer (BC) who were enrolled in Medicare at 65 years or older, having qualified by age. We delineated time-fixed variables recorded at the time of diagnosis, including age, race, marital status, breast cancer stage, tumor grade, laterality, comorbidity index, and estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) status. We analyzed six distinct prognostic categories, defined by cancer stage (I–III) and, within each stage, ER/PR+ or ER/PR− status. At each visit, we delineated the time-varying covariates of administered treatments, induced adverse events, comorbidity index, and age. We predicted the survival of three hypothetical patients to demonstrate the model’s utility.
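One common way to feed time-varying covariates to a survival model is the counting-process layout, in which each patient contributes one row per interval between visits. The sketch below illustrates that layout only; it is not taken from the study, and the function name `to_intervals`, the field names, and the use of months as the time unit are all assumptions made for illustration.

```python
from typing import Any, Dict, List


def to_intervals(
    fixed: Dict[str, Any],
    visits: List[Dict[str, Any]],
    end_time: float,
    event: bool,
) -> List[Dict[str, Any]]:
    """Expand one patient's visits into (start, stop] rows.

    fixed    -- time-fixed covariates recorded at diagnosis (hypothetical names)
    visits   -- one dict per visit, each with a "time" key plus the
                time-varying covariates observed at that visit
    end_time -- time of death or censoring
    event    -- True if the patient died at end_time, False if censored
    """
    # Ignore visits recorded at or after the end of follow-up.
    ordered = sorted(
        (v for v in visits if v["time"] < end_time), key=lambda v: v["time"]
    )
    rows = []
    for i, visit in enumerate(ordered):
        is_last = i == len(ordered) - 1
        stop = end_time if is_last else ordered[i + 1]["time"]
        if stop <= visit["time"]:
            continue  # duplicate visit time: keep only the later record
        row = dict(fixed)  # time-fixed values are repeated on every row
        row.update({k: v for k, v in visit.items() if k != "time"})
        row["start"] = visit["time"]
        row["stop"] = stop
        # The event can only occur in the final interval.
        row["event"] = int(event and is_last)
        rows.append(row)
    return rows


# Hypothetical patient: values are illustrative, not study data.
example_rows = to_intervals(
    fixed={"stage": 2, "er_pr_positive": 1},
    visits=[
        {"time": 0, "chemo": 0, "comorbidity_index": 1},
        {"time": 6, "chemo": 1, "comorbidity_index": 1},
        {"time": 18, "chemo": 1, "comorbidity_index": 3},
    ],
    end_time=30,
    event=True,
)
```

Each row carries the covariate values that were current during its interval, so a model sees the updated treatment and comorbidity status rather than only the values at diagnosis.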
Main Outcomes and Measures: The primary outcomes were the models’ prediction error, measured by the concordance index, the most commonly applied evaluation metric in survival analysis, and the integrated Brier score, a measure of both the discrimination and the calibration of a model.
Results: Extending the patient covariates to include both time-fixed and time-varying covariates significantly improved the deep learning models’ prediction error and the discrimination and calibration of their estimates. With time-fixed covariates only, the four DL models had a prediction error of approximately 30% in each of the six prognostic categories. With the proposed extension to include time-varying covariates, the accuracy of all four predictive models improved significantly, with the error decreasing to approximately 10%. The models’ predictive accuracy was independent of the differing published survival predictions from time-fixed covariates in the six prognostic categories. We demonstrate the utility of the model in three hypothetical patients with unique patient, cancer, and treatment variables. The model predicted survival based on each patient’s individual time-fixed and time-varying features, and its predictions differed considerably from Social Security age-based predictions and from stage- and race-based breast cancer survival predictions.
Conclusions: Predictive modeling of the survival of patients with early-stage breast cancer using DL models has a prediction error of around 30% when only time-fixed covariates at the time of diagnosis are considered; the error decreases to values under 10% when time-varying covariates are added as input to the models, regardless of the prognostic category of the patient group. These models can be used to predict individual patients’ survival probabilities based on their unique repertoire of time-fixed and time-varying features. They will provide guidance for patients and their caregivers to assist in decision making.
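For readers unfamiliar with the concordance index, the following is a minimal sketch of Harrell's formulation for right-censored data. It is an illustration under stated assumptions, not the evaluation code used in the study, which may rely on a time-dependent variant; the function name and the example values are hypothetical, and tied survival times are simply treated as not comparable.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs ranked correctly.

    times  -- observed time (event or censoring) per patient
    events -- 1 if the event was observed, 0 if censored
    risks  -- model risk score; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable when i had the event before j's observed time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: four patients, the second one censored.
c_index = concordance_index(
    times=[5, 10, 15, 20],
    events=[1, 0, 1, 1],
    risks=[0.9, 0.4, 0.1, 0.6],
)
```

In this example three of the four comparable pairs are ordered correctly, so the index is 0.75; a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.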