This study aims to evaluate the feasibility and effectiveness of deep transfer learning (DTL) and clinical-radiomics in differentiating thymoma from thymic cysts. Clinical and imaging data of 196 patients with pathologically confirmed thymoma or thymic cysts were retrospectively collected from center 1 (training cohort: n=137; internal validation cohort: n=59). An independent external validation cohort comprised 68 patients with thymoma or thymic cysts from center 2. Regions of interest (ROIs) were delineated on contrast-enhanced chest computed tomography (CT) images, and eight DTL models (Densenet 169, Mobilenet V2, Resnet 101, Resnet 18, Resnet 34, Resnet 50, Vgg 13, and Vgg 16) were constructed. Radiomics features were extracted from the ROIs on the CT images, and feature selection was performed using the intra-observer intraclass correlation coefficient (ICC), Spearman correlation analysis, and the least absolute shrinkage and selection operator (LASSO) algorithm. Univariate analysis and multivariable logistic regression (LR) were used to select clinical-radiological features. Six machine learning classifiers, namely LR, support vector machine (SVM), k-nearest neighbors (KNN), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and Multilayer Perceptron (MLP), were used to construct the Radiomics and Clinico-radiologic models. The features selected for the Radiomics and Clinico-radiologic models were fused to build a Combined model. Receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) were used to evaluate the discrimination, calibration, and clinical utility of the models, respectively. The DeLong test was used to compare the area under the ROC curve (AUC) between models.
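The Spearman step of the feature selection described above can be illustrated with a simple redundancy filter. This is a minimal sketch, not the authors' pipeline: the cutoff of 0.9, the greedy keep-first rule, and the feature names are assumptions made for illustration, and the ICC and LASSO steps are not shown.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_redundant(features, threshold=0.9):
    """Greedily keep a feature only if |rho| with every kept feature is <= threshold.

    `features` maps a (hypothetical) feature name to its values across patients.
    The threshold is an assumed value, not one taken from the study.
    """
    kept = []
    for name in features:
        if all(abs(spearman(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: feat_b is a monotone function of feat_a, so it is dropped.
toy = {
    "feat_a": [1, 2, 3, 4, 5],
    "feat_b": [2, 4, 6, 8, 11],
    "feat_c": [5, 1, 4, 2, 3],
}
selected = drop_redundant(toy)
```

In a workflow like the one described, the surviving features would then be passed to LASSO, which shrinks the coefficients of uninformative features to zero.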
K-means clustering was used to subdivide thymoma and thymic cyst lesions into subregions, traditional radiomics methods were used to extract features from these subregions, and correlation analysis was used to compare the ability of the Radiomics and DTL models to reflect intratumoral heterogeneity. Among the DTL models, Densenet 169 performed best, with an AUC of 0.933 (95% CI: 0.875-0.991) in the internal validation cohort and 0.962 (95% CI: 0.923-1.000) in the external validation cohort. For the Radiomics model, the AdaBoost classifier achieved AUCs of 0.965 (95% CI: 0.923-1.000) and 0.959 (95% CI: 0.919-1.000) in the internal and external validation cohorts, respectively. For the Clinico-radiologic model, the LightGBM classifier achieved AUCs of 0.805 (95% CI: 0.690-0.920) and 0.839 (95% CI: 0.736-0.943) in the internal and external validation cohorts, respectively. The AUCs of the Combined model in the internal and external validation cohorts were 0.933 (95% CI: 0.866-1.000) and 0.945 (95% CI: 0.897-0.994), respectively. The DeLong test showed that the Radiomics, DTL, and Combined models outperformed the Clinico-radiologic model in both the internal validation cohort (p = 0.002, 0.004, and 0.033, respectively) and the external validation cohort (p = 0.014, 0.006, and 0.015, respectively). However, there was no statistically significant difference in performance among the Radiomics, DTL, and Combined models (all p-values >0.05). Correlation analysis showed that radiomics performed better than DTL in quantifying differences in intratumoral heterogeneity between thymoma and thymic cysts. The developed DTL model and the Combined model based on radiomics and clinical-radiologic features achieved excellent diagnostic performance in differentiating thymic cysts from thymoma. They can serve as potential tools to assist clinical decision-making, particularly when endoscopic biopsy carries a high risk.
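For readers unfamiliar with how an AUC and its 95% confidence interval are obtained, the following is a minimal sketch. The AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The abstract does not state how the intervals were derived, so the percentile bootstrap used here is an assumption, and the labels and scores are made-up toy values rather than study data.

```python
import random


def auc(labels, scores):
    """AUC as P(score of positive > score of negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example: 1 = thymoma, 0 = thymic cyst (labels and scores are invented).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores, n_boot=200)
```

Comparing two correlated AUCs measured on the same patients, as the DeLong test does, additionally requires the covariance between the two estimates and is not covered by this sketch.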