Background: Higher-risk myelodysplastic syndromes (HR-MDS) are a severe and aggressive subset of MDS, a group of rare and heterogeneous hematologic disorders that primarily affect the bone marrow and blood cells. Treatment typically combines supportive care with active treatment, including targeted therapies, hypomethylating agents, chemotherapy, and hematopoietic stem cell transplant. In this study, we used machine-learning (ML) models to evaluate real-world treatment patterns in patients with HR-MDS and to identify clinical and demographic features that may influence which patients receive systemic therapies and when those therapies are started. Methods: We identified 821 patients aged ≥18 years with HR-MDS diagnosed between January 2015 and April 2022 using ConcertAI's RWD360 database linked with open claims data. RWD360 consists of structured records from US-based oncology electronic health record systems. Demographic and clinical data from up to 1 year before the date of HR-MDS diagnosis were used to develop a family of XGBoost-based ML models to investigate drivers of treatment. We created five models: a time-to-therapy model that predicts patient-level hazard ratios and four classification models that predict the likelihood of therapy in the first 2, 3, 6, and 12 months, respectively, following HR-MDS diagnosis (therapy yes/no models). Model performance was evaluated using Harrell's concordance index (C-index) and the Akaike information criterion (AIC) for the time-to-therapy model and the area under the curve (AUC) for the therapy yes/no models. Key features included laboratory values, Charlson comorbidity index (CCI), and prescribed medications. To better capture how a patient's disease state evolves over time, clinical features were organized into time windows, together with their variation within each window.
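As a rough illustration of the time-window features described above, the sketch below summarizes one laboratory test for one patient by its minimum value and its least-squares slope within a window before diagnosis. The function name `window_features`, the input layout, and the choice of a linear slope as the "variation" are assumptions made for illustration, not the study's actual feature definitions; the default window of 180 to 30 days before diagnosis mirrors the window reported for hemoglobin in the Results, and the sample values are made up.

```python
from datetime import date, timedelta


def window_features(measurements, diagnosis_date, start_days=180, end_days=30):
    """Summarize lab values taken between start_days and end_days before diagnosis.

    measurements: list of (measurement_date, value) pairs for one patient and one test.
    Returns a dict with the minimum, the count, and the least-squares slope per day,
    or None when no measurement falls inside the window.
    """
    points = []
    for measured_on, value in measurements:
        days_before = (diagnosis_date - measured_on).days
        if end_days <= days_before <= start_days:
            # x is time in days relative to diagnosis (negative means before).
            points.append((-days_before, value))
    if not points:
        return None

    values = [v for _, v in points]
    features = {"min": min(values), "n": len(points), "slope_per_day": None}

    # A slope needs at least two measurements taken on different days.
    if len(points) >= 2:
        mean_x = sum(x for x, _ in points) / len(points)
        mean_y = sum(values) / len(points)
        sxx = sum((x - mean_x) ** 2 for x, _ in points)
        if sxx > 0:
            sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
            # A negative slope means the value was falling before diagnosis.
            features["slope_per_day"] = sxy / sxx
    return features


# Made-up hemoglobin values (g/dL) for a hypothetical patient.
diagnosis = date(2020, 7, 1)
hemoglobin = [
    (diagnosis - timedelta(days=150), 12.0),
    (diagnosis - timedelta(days=100), 11.0),
    (diagnosis - timedelta(days=50), 10.0),
    (diagnosis - timedelta(days=10), 7.0),  # outside the window, ignored
]
features = window_features(hemoglobin, diagnosis)
```

In this toy input the last measurement falls within 30 days of diagnosis and is excluded, so only the three earlier values contribute to the minimum and the slope.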
We used recursive feature elimination to reduce the number of features in the model to 50, improving interpretability and compactness. The model was trained and evaluated on the full 821-patient cohort using five-fold cross-validation (the data were split five ways; for each fold, a different 20% was held out for validation and the remaining 80% was used for training). Results: The median age in this patient cohort was 73 years (range = 21-88 years); approximately 60% were male, and 66% of patients were treated in community oncology practices. Only 42.5% of patients had a record of ever receiving systemic active therapy; among those, the median time to therapy was 4.2 months (Figure 1). Around the time of diagnosis (assessed from 1 year before to 30 days after the diagnosis date), patients had a median hemoglobin concentration of 9.4 g/dL (range = 4.4-18.9 g/dL), a median CCI of 1 (range = 0-10), and a median polypharmacy of 10 (range = 1-93; computed among the 49% of patients who had at least 1 concomitant medication). The time-to-therapy model achieved a C-index of 0.78 (AIC = 3314.2), and the therapy yes/no models achieved AUCs of 0.70-0.74. A key predictive feature across models was the minimum white blood cell (WBC) count near the point of diagnosis; low WBC values around the time of diagnosis were consistently a strong predictor of receiving systemic therapy and of receiving it earlier. Two other top predictors of systemic therapy and its timing were the rate of fall in blood glucose measurements and the rate of fall in hemoglobin during the 180 to 30 days before diagnosis. Table 1 details model performance and the top 3 features affecting the time to and likelihood of therapy for all 5 models.
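For readers less familiar with the evaluation setup, the following is a minimal sketch of a five-fold split and of Harrell's C-index, written in plain Python. The helper names `k_fold_indices` and `harrell_c_index`, the random seed, and the toy data are hypothetical; the study's own implementation is not described in this abstract. The C-index here treats the start of therapy as the event, gives half credit to tied risk scores, and skips pairs with tied times, which is one common simplification.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training, validation) index lists; each fold holds out about 1/k of the data."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher risk score has the shorter time.

    A pair is comparable when the subject with the shorter time had an observed event.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: months to therapy, whether therapy was observed, and a model risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.3, 0.5, 0.1]
c_index = harrell_c_index(times, events, risk_scores)

# Split a cohort of 821 patients (the cohort size reported above) into five folds.
folds = list(k_fold_indices(821))
```

With 821 patients, each validation fold contains 164 or 165 patients, so every patient is held out exactly once across the five folds.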
Conclusion: Using data collected in routine clinical practice, we built a family of data-driven ML models for patients with HR-MDS that predict the time to and probability of first-line systemic therapy from patient characteristics with reasonable accuracy. A majority of patients diagnosed with HR-MDS in this study appeared to receive delayed or no systemic therapy during their treatment journey. To bridge gaps in care and facilitate timely treatment in eligible patients, data-driven predictive modeling that takes detailed patient-level factors into account is critical. Further analysis of outcomes in untreated patients, linked to the key clinical characteristics identified by these models, may provide opportunities to improve clinical care through treatment strategies based on patient-level factors.