Abstract

e19318 Background: ECOG PS is a prognostic indicator of outcomes, and scores of 0-1 (good ECOG PS) are often required for clinical trial enrollment. Patients treated in non-trial settings often lack ECOG PS scores, which limits the use of real-world data from these patients in external control arms (ECAs) and reduces the specificity achievable in clinical effectiveness research. Machine learning (ML) can be used to impute ECOG PS scores from other clinical data at various points during treatment. Methods: We developed a series of models using logistic regression (LR) or XGBoost (XGB) that impute ECOG PS at initial diagnosis, metastatic diagnosis, and final evaluation, using a curated non-small cell lung cancer cohort of 31,425 patients with at least one ECOG PS score. Results: Using large numbers (i.e., thousands) of features, AUC-ROC values of up to 0.81 were obtained for imputing a patient’s final ECOG PS, with lower AUC values when imputing ECOG PS at initial and metastatic diagnosis. We also developed more interpretable models with 110 or 40 features; these had reduced but still satisfactory AUC and predicted good ECOG PS scores with an accuracy of around 80%. Key features were obtained from lab tests, physical exams, comorbidities, medications, age, and metastatic status. The table below shows the results of several of these models. Where the models misclassified ECOG PS, the error was rarely greater than 1 grade. Conclusions: Because ECOG PS is itself subjective, these results suggest that ML-based cohort assignment will be sufficiently accurate to support its use in research. Further work will be required to assess whether the ML-predicted cohorts have different outcomes. [Table: see text]
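
As an illustration of the evaluation metric named in the Results, the following is a minimal sketch of how ECOG PS grades might be binarized into good (0-1) versus other and how an AUC-ROC could then be computed from a model's predicted probabilities. The function names and the toy data are hypothetical and are not taken from the study; the abstract does not describe the authors' actual evaluation code.

```python
from typing import Sequence


def is_good_ecog(score: int) -> int:
    """Binarize an ECOG PS grade: 1 for good (0-1), 0 otherwise."""
    return 1 if score <= 1 else 0


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC computed as the fraction of (positive, negative) pairs in
    which the positive receives the higher score; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data invented purely for illustration (assumption, not study data).
observed_ecog = [0, 1, 2, 3, 1, 2]
predicted_prob_good = [0.9, 0.7, 0.4, 0.1, 0.35, 0.6]

labels = [is_good_ecog(s) for s in observed_ecog]
toy_auc = auc_roc(labels, predicted_prob_good)
print(f"toy AUC-ROC: {toy_auc:.3f}")
```

The pairwise formulation is quadratic in the number of patients and is meant only to make the definition concrete; for a cohort of the size described here, a rank-based library implementation would be the practical choice.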
