Abstract

Patients undergoing radiation therapy (RT) or chemoradiation (CRT) may require emergency department (ED) evaluation or hospitalization because of treatment effects or comorbidities. Identifying these patients early could direct supportive care during treatment, preventing such events and improving quality of care while reducing healthcare costs. We previously applied multiple machine learning (ML) techniques (gradient boosted trees [GBT], random forest [RF], and support vector machine [SVM]) to pre-treatment data extracted programmatically from the electronic health record (EHR) and identified these patients with good predictive accuracy; GBT was the most predictive. Here we present the performance of the GBT model by disease site in the internal validation cohort. We identified 8,462 outpatient RT courses for adult patients at the Duke Cancer Institute from 2013 to 2016. Extensive structured pre-treatment EHR data were extracted from Duke’s enterprise data warehouse and programmatically converted to standardized terms: demographics; encounters and vitals in the year prior to RT; labs in the four weeks prior to RT; medical history by ICD codes; and medications at the start of RT. Treatment data such as number of RT fractions and recent or concurrent systemic therapy were also used to train the models. The courses were randomly divided into training (75%) and test (25%) sets. ML models were trained on the training set to predict ED visits or admissions and validated on the test set. Model performance for each disease site was assessed by the area under the receiver operating characteristic curve (ROC AUC) in the validation set. Overall, the GBT model had high predictive performance (AUC 0.812 for the entire validation cohort), but performance differed by treatment site. Strong performance was noted for patients treated for brain metastases (AUC 0.884), gynecologic cancer (0.859), and head and neck cancer (0.816). Good performance was observed for bone metastases (0.744), breast cancer (0.791), and lung cancer (0.744). Performance was weaker for patients treated for GI cancers (0.680), and a GBT model trained exclusively on GI patients performed similarly (0.687). ML using diverse pre-treatment and treatment data predicts emergency visits and hospitalization for patients undergoing cancer therapy. Performance appears to vary by disease site despite the ability of ML to capture complex interactions between variables. The weaker prediction of admission for patients with GI cancers may be related to the diverse subsites this group comprises compared with other disease sites. Continued iterative improvement of the algorithm, external validation, and a prospective trial of ML-directed increases in clinical assessment during treatment are planned.
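
The site-by-site evaluation described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `roc_auc` and `auc_by_site`, and the toy site labels, outcomes, and scores, are hypothetical and chosen only for illustration. The sketch computes ROC AUC from the rank-sum (Mann-Whitney U) identity, first for a whole validation set and then separately for each disease site, assuming each validation course carries a site label, a binary outcome (ED visit or admission), and a predicted risk from an already trained model.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both outcome classes are present")

    # Assign 1-based ranks to the scores, averaging ranks within tied groups.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


def auc_by_site(sites, labels, scores):
    """Group validation courses by disease site and compute AUC within each group."""
    grouped = defaultdict(lambda: ([], []))
    for site, y, p in zip(sites, labels, scores):
        grouped[site][0].append(y)
        grouped[site][1].append(p)
    return {site: roc_auc(ys, ps) for site, (ys, ps) in grouped.items()}


# Hypothetical toy validation set (not study data).
sites = ["brain", "brain", "brain", "brain", "gi", "gi", "gi", "gi"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]          # 1 = ED visit or admission
scores = [0.9, 0.2, 0.8, 0.1, 0.6, 0.7, 0.4, 0.3]  # predicted risk

overall_auc = roc_auc(labels, scores)
per_site_auc = auc_by_site(sites, labels, scores)
print(overall_auc, per_site_auc)
```

In this toy example the pooled AUC sits between the two per-site values, which mirrors how a single cohort-level AUC can mask weaker discrimination within one disease site.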

