Abstract
Background: Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous non-Hodgkin's lymphoma that poses considerable clinical challenges. Machine learning (ML) has attracted substantial attention in the diagnosis, prognosis, and treatment of diseases. This study aimed to identify prognostic factors of DLBCL using ML.

Methods: In total, 1211 DLBCL patients were retrieved from the Huaihai Lymphoma Working Group (HHLWG). The least absolute shrinkage and selection operator (LASSO) and the random forest algorithm were used to identify prognostic factors for overall survival (OS) in DLBCL from among twenty-five candidate variables. Receiver operating characteristic (ROC) curves and decision curve analysis (DCA) were used to compare the predictive performance and clinical effectiveness of the two models, respectively.

Results: The median follow-up time was 43.4 months, and the 5-year OS was 58.5%. The LASSO model achieved an area under the curve (AUC) of 75.8% for the prognosis of DLBCL, which was higher than that of the random forest model (AUC: 71.6%). DCA also showed that the LASSO model provided a higher net benefit and a wider range of threshold probabilities for risk stratification than the random forest model. In addition, multivariable analysis demonstrated that age, white blood cell count, hemoglobin, central nervous system involvement, gender, and Ann Arbor stage were independent prognostic factors for DLBCL. The LASSO model discriminated outcomes better than the IPI and NCCN-IPI models and identified three groups of patients: low risk, high-intermediate risk, and high risk.

Conclusions: The LASSO regression-based prognostic model of DLBCL was more accurate than the random forest, IPI, and NCCN-IPI models.
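The two comparison metrics named in the abstract, AUC and the net benefit used in decision curve analysis, can be illustrated with a minimal sketch. It is not the authors' analysis code: the outcome labels and the two score lists (`model_a`, `model_b`) are made-up toy values, the function names are hypothetical, and censoring and time-to-event structure are ignored. The sketch only shows how each metric is defined for a binary outcome, using the standard net-benefit formula TP/n − (FP/n) · pt/(1 − pt) at threshold probability pt.

```python
def auc(y_true, scores):
    """Rank-based AUC: the fraction of (event, non-event) pairs in which the
    event case receives the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, scores, threshold):
    """Net benefit at one threshold probability pt:
    TP/n - (FP/n) * pt / (1 - pt), treating score >= pt as 'high risk'."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy data (assumption): 1 = event, 0 = no event; scores are predicted risks.
    y = [1, 1, 1, 0, 0, 0, 0, 1]
    model_a = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.7, 0.5]
    model_b = [0.9, 0.4, 0.6, 0.5, 0.3, 0.7, 0.8, 0.2]

    for name, scores in (("model_a", model_a), ("model_b", model_b)):
        print(name, "AUC:", auc(y, scores))
        # A decision curve plots net benefit over a range of thresholds.
        for pt in (0.2, 0.5, 0.8):
            print("  net benefit at pt =", pt, ":", net_benefit(y, scores, pt))
```

A model whose decision curve stays above the alternatives over a wider span of thresholds is the one described as clinically more useful, which is the sense in which the abstract compares the LASSO and random forest models.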