Machine Learning Can Outperform Ann Arbor Staging in Predicting Survival in Patient with Diffuse Large B-Cell Lymphoma: Analysis of a Large National Cancer Database

Madhan Srinivasan Kumar, Debra Hogue, Taha Mahdi Salih Al-Juhaishi, Ji Hwan Park, Abdul Rafeh Naqash, Veena Gujju

doi:10.1182/blood-2023-187781

Abstract

Introduction: Diffuse large B-cell lymphoma (DLBCL) is the most common lymphoma worldwide and usually follows an aggressive clinical course. The Ann Arbor staging system and the International Prognostic Index (IPI), both commonly used in clinical practice for risk stratification, have known limitations. Machine learning (ML) has emerged as a promising tool for more comprehensive and deeper data analysis. We sought to evaluate the ability of ML to predict survival in DLBCL, compared with the Ann Arbor staging system, using a large national database.

Methodology: We applied the ML algorithm XGBoost to the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) database to predict overall survival (OS) and lymphoma-specific survival (LSS). For prediction analysis, we transformed the survival labels into a simple Boolean format: “alive” represented as 0, “dead” as 1, and “dead (attributable to this cancer diagnosis)” also as 1. We used one-hot encoding to convert categorical features into binary vectors. The data set was divided into a training set (80%) and a test set (20%). We further split the training set into training and validation subsets using stratified 5-fold cross-validation. Hyperparameter optimization was performed on the validation set. The model drew on a broad range of attributes for its predictions. To understand how each attribute contributed to the predictions, we calculated its importance score in XGBoost.

Results: A total of 64,912 patients with DLBCL were identified and their data extracted. The majority were Caucasian (78.9%), and the median age fell in the 60 to 69 year range. The model predicted OS and LSS with an area under the curve (AUC) of 0.89 and 0.75, respectively (Figure 1). Factors selected by the model for survival prediction included the presence or absence of B symptoms, treatment status, and disease stage. For OS and LSS, the model found B symptoms to be the highest contributing factor, with importance scores of 0.205 and 0.167, respectively. Other important factors included age and stage IV for OS, and stage IV and clinically asymptomatic status for LSS. The least important factors were the location of the primary lymphoma site and the year of diagnosis (Table 1).

Conclusion: Machine learning tools can help predict survival in patients with DLBCL and are able to challenge current staging systems. Our results warrant validation in future prospective studies.
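The preprocessing steps described in the Methodology (Boolean survival labels, one-hot encoding, an 80/20 split, and stratified 5-fold cross-validation) can be illustrated with a minimal sketch. It uses only the Python standard library and invented toy data, not SEER records. All function names, the status strings, and the random seeds are assumptions made for illustration, and the XGBoost training step itself is deliberately left out; this is not the authors' implementation.

```python
import random
from collections import defaultdict

# Assumed status strings, mirroring the labels quoted in the abstract.
STATUS_TO_LABEL = {
    "alive": 0,
    "dead": 1,
    "dead (attributable to this cancer diagnosis)": 1,
}


def encode_label(status):
    """Map a survival status string to a Boolean label (0 = alive, 1 = dead)."""
    return STATUS_TO_LABEL[status.strip().lower()]


def one_hot(values):
    """One-hot encode one categorical column; returns (categories, binary rows)."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split row indices into train/test so both keep similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def stratified_folds(indices, labels, k=5, seed=0):
    """Deal the given indices into k folds, class by class, round-robin."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Toy data, invented for illustration only.
statuses = (["alive"] * 60 + ["dead"] * 25
            + ["dead (attributable to this cancer diagnosis)"] * 15)
stages = ["I", "II", "III", "IV"] * 25

labels = [encode_label(s) for s in statuses]
categories, stage_vectors = one_hot(stages)
train_idx, test_idx = stratified_split(labels)   # 80% / 20%
folds = stratified_folds(train_idx, labels)      # 5 validation folds
```

In a full pipeline, each fold would serve once as the validation subset while the other four train the model, and the held-out 20% would be used only for the final AUC estimate. Stratifying by label keeps the proportion of deaths roughly equal across every subset, which matters when the outcome classes are imbalanced.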

Journal: Blood

