e23253 Background: This study attempted to address limitations in the A-PROACCT model developed by Stein et al. (JOP, 2023) to predict the rate of acute care events within 30 days (ACE30) of initial chemotherapy administration (ICA). In that model, the four most important predictors were drug category (immunotherapy), cancer type (gastrointestinal), age group (< 40), and BMI category (underweight). Only linear relationships were considered. Among patients predicted to have an ACE30 by their first and second models, only 0.18 and 0.23, respectively, actually had an ACE30, a low positive predictive value (PPV). We attempted to improve the PPV with advanced modeling techniques. Methods: Using Orlando Health data from February 2012 to April 2021, random sampling was used to split the dataset into training (10,838) and validation (4645) sets. For the training data, resampling was employed to obtain a balanced sample, with equal numbers of observations followed and not followed by an ACE30. We included only the variables used in Stein et al. (drug category, age group, ED visits and hospitalizations in the prior year, cancer type, insurance category, number of anti-cancer agents, race, and BMI category). Logistic regression with an L1 penalty, XGBoost (a nonparametric tree-based algorithm that can capture nonlinear relationships), and artificial neural networks were used to develop and evaluate predictive models. Different sampling methods (bootstrap and SMOTE) as well as cut-off thresholds for high-risk groups were tested. Results: Based on evaluation with the validation data set, the best performing approach was an XGBoost model with SMOTE resampling in the training data. The four most important predictors were ED visits in the prior year, payor category (self-insured), cancer type (genitourinary), and hospitalization in the past year. This model reported a PPV of 0.27. Many other combinations of the methods described above were tested, and the PPV ranged from 0.19 to 0.27.
Of the 282 ICAs identified as high-risk by the best model, 76 (27.0%) had an ACE30. Of the 4346 ICAs identified as low-risk, 533 (12.3%) had an ACE30. The difference between the high-risk and low-risk ACE30 rates was statistically significant (p < 0.0001). In comparison, applying the Stein et al. models to our data yielded PPVs of 0.27 and 0.25, essentially the same as our best trained model. Conclusions: Using the variables included in A-PROACCT and a variety of machine learning models, advanced sampling methodologies, and cut-off thresholds, the best results were similar to those obtained using basic logistic regression. This suggests that improving the prediction of acute care events following chemotherapy administration will require incorporating more extensive multimodal data, such as vital signs, performance status, patient-reported outcomes, socioeconomic factors, laboratory and radiology results, remote patient monitoring, and wearable device data.
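The reported PPV and the high-risk versus low-risk comparison can be reproduced from the counts given in the Results. The sketch below is illustrative only: the abstract does not state which significance test was used, so the pooled two-proportion z-test here is an assumption, and the function names are hypothetical.

```python
import math


def rate(events: int, total: int) -> float:
    """Proportion of ICAs in a group that were followed by an ACE30."""
    return events / total


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test (assumed test; not named in the abstract).

    Returns the z statistic and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts taken from the Results of the abstract.
high_events, high_total = 76, 282    # ICAs flagged high-risk by the best model
low_events, low_total = 533, 4346    # ICAs flagged low-risk

ppv = rate(high_events, high_total)      # PPV = ACE30 rate among flagged ICAs
low_rate = rate(low_events, low_total)   # ACE30 rate among unflagged ICAs
z, p_value = two_proportion_z_test(high_events, high_total, low_events, low_total)

print(f"PPV (high-risk ACE30 rate): {ppv:.3f}")
print(f"Low-risk ACE30 rate:        {low_rate:.3f}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2e}")
```

Under this assumed test, the counts are consistent with the reported p < 0.0001, and the PPV is simply the ACE30 rate within the group flagged as high-risk.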