Abstract

The most accurate prognostic approach for follicular lymphoma (FL), progression of disease at 24 months (POD24), requires two years’ observation after initiating first-line therapy (L1) to predict outcomes. We applied machine learning to structured electronic health record (EHR) data to predict individual survival at L1 initiation. We grouped 523 observations and 1933 variables from a nationwide cohort of FL patients diagnosed 2006–2014 in the Veterans Health Administration into three sets: traditionally used prognostic variables (“curated”), commonly measured labs (“labs”), and International Classification of Diseases diagnostic codes (“ICD”). We compared the performance of random survival forests (RSF) vs. a traditional Cox model using four datasets: curated, curated + labs, curated + ICD, and curated + ICD + labs; we also fit Cox on curated + POD24. We evaluated performance with area under the receiver operating characteristic curve (AUC) and examined variable importance and partial dependence plots. RSF with curated + labs performed best, with mean AUC 0.73 (95% CI: 0.71–0.75). It approximated, but did not surpass, Cox with POD24 (mean AUC 0.74 [95% CI: 0.71–0.77]). RSF using EHR data achieved better performance than traditional prognostic variables, setting the foundation for the incorporation of our algorithm into the EHR. It also points to possible future scenarios in which clinicians could be given an EHR-based tool that approximates the predictive ability of the most accurate known indicator, using information available 24 months earlier.
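As a reading aid for the reported metric, the sketch below shows one way a mean AUC with a 95% CI could be computed from repeated evaluations. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say how the AUC was adapted to censored survival data or how the interval was derived. Here the outcome is simplified to a hypothetical binary label (for example, death by a fixed horizon, ignoring censoring), and the interval is a normal approximation across hypothetical repeated splits. The names `roc_auc` and `mean_with_ci` are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher risk score; ties count as half a win.

    Assumption: labels are binary (1 = event, 0 = no event). Censoring,
    which a survival analysis must handle, is ignored in this sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_with_ci(values, z=1.96):
    """Mean and normal-approximation 95% CI of per-split AUC values.

    Assumption: the splits are treated as independent draws; the source
    does not state which interval method was used.
    """
    m = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return m, m - half_width, m + half_width


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
toy_auc = roc_auc(toy_labels, toy_scores)

# Hypothetical per-split AUCs, e.g. from repeated train/test splits.
split_aucs = [0.70, 0.72, 0.74, 0.76]
auc_mean, auc_low, auc_high = mean_with_ci(split_aucs)
```

The pairwise definition is quadratic in the number of patients, which is acceptable for a cohort of a few hundred observations; a rank-based formulation would give the same value more efficiently on larger data.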

Highlights

  • Using data from the Veterans Affairs (VA) Cancer Registry System and pharmacy dispensation records from the VA Corporate Data Warehouse, we identified a nationwide cohort of patients with grade 1–3a, stage II–IV FL diagnosed from 1 January 2006 to 31

  • Using the curated clinical variable set, the baseline Cox model yielded a mean AUC of 0.64, while the random survival forests (RSF) model achieved a mean AUC

Introduction

Follicular lymphoma (FL), the most common indolent non-Hodgkin lymphoma [1], accounts for about 20% of non-Hodgkin lymphoma [2,3]. Patients with FL have highly heterogeneous prognoses; some experience an indolent course of disease, while others endure a more aggressive disease with a trajectory that can include frequent progression, relapse, and early demise [4,5,6]. Patients and clinicians must calibrate therapy choice to the risk posed by FL, because treatment risks could outweigh the benefits [4,9]. To apply risk-adapted treatment strategies effectively, clinicians must be able to accurately identify high-risk patients early in the course of disease, but the methods currently available for this task have significant drawbacks
