1614 Background: Despite the unique ability of machine learning (ML) techniques to uncover complex interactive effects in datasets, studies exploring predictors of clinical trial enrollment have largely used standard statistical methods to discern associations. Furthermore, ML has seen limited application to closing existing disparities in clinical trial enrollment of underrepresented patients in neuro-oncology. This study aims to use supervised machine learning to define and validate primary contributors to therapeutic clinical trial enrollment for a) all patients, b) women, and c) NIH-designated minority patients with low- and high-grade glioma. Methods: Adult glioma patients who received care from the UCSF Brain Tumor Center between 1997 and 2017 were identified in a prospective registry. Bootstrap forest (BF) and recursive partitioning (RP) models were created by randomly dividing patients in a 70/30 split into development (DEV) and validation (VAL) cohorts among all patients, women, and NIH-designated minority patients, separately. Model performance was assessed using the area under the curve (AUC) in the DEV and VAL cohorts. Results: Among 1042 patients, 350 patients (33.6%) enrolled in a therapeutic clinical trial. There were 445 women (42.7%), of whom 144 (32.4%) enrolled, and 141 minority patients (13.5%), of whom 39 (27.7%) enrolled. For all patients, in order of decreasing influence, the BF model selected median neighborhood household income, age, neighborhood poverty level, distance from hospital, tumor location, tumor volume, treatment with chemotherapy, occupation, KPS, insurance status, employment status, seizure at presentation, and volumetric extent of resection (EOR) to predict trial enrollment (DEV AUC: 0.984; VAL AUC: 0.746).
For women, the BF model selected distance from hospital, age, household income, neighborhood poverty level, tumor location, treatment with chemotherapy, and KPS to predict trial enrollment (DEV AUC: 0.972; VAL AUC: 0.746), whereas for minority patients, the BF model selected neighborhood poverty level, age, occupation, distance from hospital, tumor volume, insurance status, employment status, and EOR to predict trial enrollment (DEV AUC: 0.983; VAL AUC: 0.769). The RP model for women designated those with non-White race as less likely to enroll (DEV AUC: 0.735; VAL AUC: 0.665), while the RP model for minority patients characterized those who preferred a non-English language or who were divorced or widowed as less likely to enroll (DEV AUC: 0.775; VAL AUC: 0.601). Conclusions: Supervised machine learning models achieved similar predictive performance among patient subgroups, while uncovering interactions between gender identity, minority status, and sociodemographic variables. These findings can guide targeted recruiting efforts to advance equitable trial enrollment for women and minorities.
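
The evaluation design described in the Methods (a random 70/30 division into DEV and VAL cohorts, scored by AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_70_30` and `auc`, the fixed seed, the unstratified shuffle, and the rounding of the 70% cut point are all assumptions made for illustration, and the toy labels and scores are hypothetical rather than study data. The AUC is computed here through its pairwise-ranking interpretation, which may differ from the software routine the study used.

```python
import random


def split_70_30(records, seed=0):
    """Randomly divide records into development (70%) and validation (30%) cohorts.

    Assumption: a simple unstratified shuffle; the abstract does not say
    whether the split was stratified by enrollment status.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (enrolled, not enrolled) pairs in which the
    enrolled patient received the higher predicted score; ties count half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical usage: patient indices stand in for registry records.
dev, val = split_70_30(range(1042))

# Toy labels (1 = enrolled) and toy model scores, not study data.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.3, 0.1]
print(len(dev), len(val), auc(toy_labels, toy_scores))
```

Computing the same metric separately on `dev` and `val` predictions is what makes gaps such as the BF model's DEV AUC of 0.984 versus VAL AUC of 0.746 visible.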