Abstract

BackgroundAlthough many patients receive good prognoses with standard therapy, 30–50% of diffuse large B-cell lymphoma (DLBCL) cases may relapse after treatment. Statistical or computational intelligent models are powerful tools for assessing prognoses; however, many cannot generate accurate risk (probability) estimates. Thus, probability calibration-based versions of traditional machine learning algorithms are developed in this paper to predict the risk of relapse in patients with DLBCL.MethodsFive machine learning algorithms were assessed, namely, naïve Bayes (NB), logistic regression (LR), random forest (RF), support vector machine (SVM) and feedforward neural network (FFNN), and three methods were used to develop probability calibration-based versions of each of the above algorithms, namely, Platt scaling (Platt), isotonic regression (IsoReg) and shape-restricted polynomial regression (RPR). Performance comparisons were based on the average results of the stratified hold-out test, which was repeated 500 times. We used the AUC to evaluate the discrimination ability (i.e., classification ability) of the model and assessed the model calibration (i.e., risk prediction accuracy) using the H-L goodness-of-fit test, ECE, MCE and BS.ResultsSex, stage, IPI, KPS, GCB, CD10 and rituximab were significant factors predicting the 3-year recurrence rate of patients with DLBCL. For the 5 uncalibrated algorithms, the LR (ECE = 8.517, MCE = 20.100, BS = 0.188) and FFNN (ECE = 8.238, MCE = 20.150, BS = 0.184) models were well-calibrated. The errors of the initial risk estimate of the NB (ECE = 15.711, MCE = 34.350, BS = 0.212), RF (ECE = 12.740, MCE = 27.200, BS = 0.201) and SVM (ECE = 9.872, MCE = 23.800, BS = 0.194) models were large. With probability calibration, the biased NB, RF and SVM models were well-corrected. The calibration errors of the LR and FFNN models were not further improved regardless of the probability calibration method. Among the 3 calibration methods, RPR achieved the best calibration for both the RF and SVM models. The power of IsoReg was not obvious for the NB, RF or SVM models.ConclusionsAlthough these algorithms all have good classification ability, several cannot generate accurate risk estimates. Probability calibration is an effective method of improving the accuracy of these poorly calibrated algorithms. Our risk model of DLBCL demonstrates good discrimination and calibration ability and has the potential to help clinicians make optimal therapeutic decisions to achieve precision medicine.

Highlights

  • Many patients receive good prognoses with standard therapy, 30–50% of diffuse large B-cell lymphoma (DLBCL) cases may relapse after treatment

  • Probability calibration is an effective method of improving the accuracy of these poorly calibrated algorithms

  • The application of rituximab is a breakthrough in DLBCL, and current studies have shown that rituximab improves survival in almost all DLBCL subgroups [4, 42,43,44]

Read more

Summary

Introduction

Many patients receive good prognoses with standard therapy, 30–50% of diffuse large B-cell lymphoma (DLBCL) cases may relapse after treatment. Diffuse large B-cell lymphoma (DLBCL) remains a clinical challenge due to its heterogeneous manifestations and prognosis [1, 2]. Durable remission can be obtained in more than 50% of cases, relapse still occurs in 30–50% of patients with standard therapy, which dramatically reduces their survival rates [3, 4]. The accurate prediction of the risk of recurrence in DLBCL patients is crucial to clinical decision-making, as it is part of a growing trend toward precision medicine [7]. Given that many cases may have recurrences in 3 years, a model that can predict the 3-year recurrence rate of DLBCL patients is urgently required

Objectives
Methods
Results
Discussion
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.