Abstract
Introduction
Supervised machine learning approaches are increasingly used to analyze clinical data, including in geriatric oncology. This study presents a machine learning approach to understand falls in a cohort of older adults with advanced cancer starting chemotherapy, including fall prediction and identification of contributing factors.
Materials and Methods
This secondary analysis used prospectively collected data from the GAP 70+ Trial (NCT02054741; PI: Mohile), which enrolled patients aged ≥70 with advanced cancer and ≥1 geriatric assessment domain impairment who planned to start a new cancer treatment regimen. Of ≥2000 baseline variables ("features") collected, 73 were selected based on clinical judgment. Machine learning models to predict falls at three months were developed, optimized, and tested using data from 522 patients. A custom data preprocessing pipeline was implemented to prepare data for analysis. Both undersampling and oversampling techniques were applied to balance the outcome measure. Ensemble feature selection was applied to identify and select the most relevant features. Four models (logistic regression [LR], k-nearest neighbor [kNN], random forest [RF], and multilayer perceptron [MLP]) were trained and subsequently tested on a holdout set. Receiver operating characteristic (ROC) curves were generated and the area under the curve (AUC) was calculated for each model. SHapley Additive exPlanations (SHAP) values were used to further understand individual feature contributions to observed predictions.
Results
Based on the ensemble feature selection algorithm, the top eight features were selected for inclusion in the final models. Selected features aligned with clinical intuition and prior literature. The LR, kNN, and RF models performed equivalently well in predicting falls in the test set, with AUC values of 0.66–0.67, and the MLP model achieved an AUC of 0.75. Ensemble feature selection resulted in improved AUC values compared to using LASSO alone. SHAP values, a model-agnostic technique, revealed logical associations between selected features and model predictions.
Discussion
Machine learning techniques can augment hypothesis-driven research, including in older adults for whom randomized trial data are limited. Interpretable machine learning is particularly important, as understanding which features impact predictions is a critical aspect of decision-making and intervention. Clinicians should understand the philosophy, strengths, and limitations of a machine learning approach applied to patient data.
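The abstract names class balancing and AUC as steps in the analysis but does not describe how they were implemented. The following is a minimal sketch of two of those ideas, random oversampling of the minority class and AUC computed as a pairwise ranking probability, written in plain Python. The function names (`random_oversample`, `roc_auc`) and the toy data are hypothetical and chosen for illustration; they are not taken from the study, which may have used different resampling methods and library routines.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest one (training data only)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (not from the GAP 70+ Trial):
# label 1 = fall within three months, label 0 = no fall.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]
toy_rows = [[i] for i in range(8)]
balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels, seed=42)

# Hypothetical predicted fall probabilities from some fitted model.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.2, 0.5, 0.9]
toy_auc = roc_auc(toy_labels, toy_scores)
print(Counter(balanced_labels), round(toy_auc, 3))
```

In a sketch like this, resampling would be applied only to the training split so that the holdout set keeps the original class proportions; the AUC is then computed on untouched holdout predictions.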