Background and objective
Positron emission tomography/computed tomography (PET/CT) is recommended as the standard imaging modality for diffuse large B-cell lymphoma (DLBCL) staging. However, many studies have neglected the role of patients' prognostic factors in relation to quantitative PET/CT imaging features. In this paper, a multi-view learning (MVL) model is established to make full use of both clinical and imaging data to predict the prognosis of DLBCL patients and thereby assist doctors in decision-making.

Methods
Feature engineering, including feature extraction, feature screening by recursive feature elimination, and dimensionality reduction by principal component analysis, is performed successively on the clinical data and imaging data of the research subjects to obtain the study data. After dividing the data into training and test sets, an instance weighting method is applied to the training data. Subsequently, kernel mapping is performed on the imaging features and clinical features separately, and the mapped features are processed in the new kernel feature space using kernel canonical correlation analysis (KCCA). Lastly, model training is performed on the obtained common kernel subspace using a support vector machine (SVM). The final overall model, named SVM-2view-KCCA (SVM-2K), was compared with three other multi-view models (Ensemble-SVM, multi-view maximum entropy discrimination, and canonical correlation analysis). The performance of the model was evaluated on the test data with respect to several dichotomous metrics: accuracy, sensitivity, F1 score, the area under the curve (AUC), and G-mean.

Results
After feature engineering based on dimensionality reduction and instance weighting, the SVM model improved on the DLBCL test data by 10.5% in AUC, 11.9% in sensitivity, 9.8% in accuracy, 9.2% in F1 score, and 7.8% in G-mean.
In the performance comparison of single-view learning models, the SVM-based integration of clinical and imaging features achieved the best overall performance (AUC = 86.3%, accuracy = 91.6%, sensitivity = 83.2%, F1 = 85.7%, and G-mean = 86.1%). In the comparison of MVL models, SVM-2K achieved the best overall performance (AUC = 92.1%, accuracy = 96.9%, sensitivity = 90.9%, F1 = 92.8%, and G-mean = 91.4%), and the performance of each MVL model was better than that of the best single-view learning model.

Conclusions
MVL models outperformed single-view learning models. Of the MVL models, the proposed SVM-2K achieved the best overall performance and could accurately predict patient prognosis.
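
For readers less familiar with the dichotomous metrics reported above, the following is a minimal sketch of how they are commonly computed. It is not the authors' evaluation code. It assumes labels coded as 0/1 with 1 as the positive class, and it assumes the usual definition of G-mean as the geometric mean of sensitivity and specificity, which the abstract does not spell out. The function names `binary_metrics` and `auc_score` are hypothetical.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, F1, and G-mean for 0/1 labels.

    Assumption: 1 is the positive class. G-mean is taken here as
    sqrt(sensitivity * specificity), the common definition.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard every ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (
        2 * precision * sensitivity / (precision + sensitivity)
        if (precision + sensitivity)
        else 0.0
    )
    g_mean = math.sqrt(sensitivity * specificity)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "g_mean": g_mean,
    }


def auc_score(y_true, scores):
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

G-mean is useful alongside accuracy when the classes are imbalanced, because it drops toward zero whenever either sensitivity or specificity is poor, whereas accuracy can stay high if the majority class is predicted well.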