From Multimodal Data to Clinical Insight: An Explainable Model for Preoperative Salivary Gland Lesion Diagnosis.

Abstract

To develop and validate a multimodal dual-step support vector machine model (SVM-DualNet) for the preoperative three-class classification of salivary gland lesions (SGLs) to support clinical decision-making. We retrospectively collected clinical, conventional ultrasound (CUS), shear wave elastography (SWE), and radiomics features from 284 patients with SGLs. For malignancy discrimination and pleomorphic adenoma identification, linear SVM models based on different modality combinations were constructed and compared. The best-performing binary models were sequentially combined to form SVM-DualNet. SHapley Additive exPlanations (SHAP) were applied for global and case-level interpretation and incorporated into diagnostic assistance. Clinical utility was evaluated by comparing the junior radiologist's diagnostic performance before and after SHAP assistance and by comparison with the senior radiologist. In the test cohort, SVM-DualNet achieved a balanced accuracy of 0.76 and a macro F1 score of 0.82 for the three-class classification. The binary models discriminated malignancy and pleomorphic adenoma with AUCs of 0.90 (95% CI: 0.82-0.97) and 0.85 (95% CI: 0.76-0.94), respectively. SHAP-assisted review improved the junior radiologist's balanced accuracy from 0.55 to 0.70 and macro F1 from 0.57 to 0.75, approaching the senior radiologist's performance. The model provides reliable preoperative classification of SGLs and can assist clinicians in decision-making.
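As a rough illustration of the sequential design described above, the following Python sketch chains two binary decisions into one three-class label and computes balanced accuracy and macro F1. The step order (malignancy first, then pleomorphic adenoma among the remaining lesions), the name of the third class, and all function and variable names are assumptions made for illustration; the abstract does not specify them. The two callables stand in for fitted binary classifiers such as linear SVMs.

```python
from typing import Callable, Sequence

# Class labels; "other_benign" is an assumed name for the third class.
MALIGNANT = "malignant"
PLEOMORPHIC_ADENOMA = "pleomorphic_adenoma"
OTHER_BENIGN = "other_benign"


def dual_step_predict(x, is_malignant: Callable, is_pa: Callable) -> str:
    """Chain two binary decisions into a three-class prediction.

    Step 1 screens for malignancy; step 2 runs only on lesions that
    step 1 did not flag. The order is an assumption of this sketch.
    """
    if is_malignant(x):
        return MALIGNANT
    return PLEOMORPHIC_ADENOMA if is_pa(x) else OTHER_BENIGN


def _counts(y_true: Sequence[str], y_pred: Sequence[str], label: str):
    """Return (tp, fp, fn) for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    return tp, fp, fn


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        tp, _, fn = _counts(y_true, y_pred, label)
        recalls.append(tp / (tp + fn))
    return sum(recalls) / len(recalls)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp, fp, fn = _counts(y_true, y_pred, label)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy stand-ins for the two fitted binary models (hypothetical scores).
step1 = lambda case: case["malignancy_score"] > 0
step2 = lambda case: case["pa_score"] > 0

example = {"malignancy_score": -0.4, "pa_score": 1.2}
print(dual_step_predict(example, step1, step2))
```

Because the second model only sees cases passed on by the first, an error at step 1 cannot be corrected at step 2, which is one reason class-balanced metrics are useful for evaluating such a cascade.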

Similar Papers
  • Research Article
  • Cited by 58
  • 10.12659/msm.918452
Ultrasound Computer-Aided Diagnosis (CAD) Based on the Thyroid Imaging Reporting and Data System (TI-RADS) to Distinguish Benign from Malignant Thyroid Nodules and the Diagnostic Performance of Radiologists with Different Diagnostic Experience.
  • Jan 2, 2020
  • Medical Science Monitor
  • Zhuang Jin + 9 more

Background: The diagnosis of thyroid cancer and distinguishing benign from malignant thyroid nodules by junior radiologists can be challenging. This study aimed to develop a computer-aided diagnosis (CAD) system based on the Thyroid Imaging Reporting and Data System (TI-RADS) to distinguish benign from malignant thyroid nodules by analyzing ultrasound images to improve the diagnostic performance of junior radiologists. Material/Methods: A modified TI-RADS based on a convolutional neural network (CNN) was used to develop the CAD system. This retrospective study reviewed 789 thyroid nodules from 695 patients and included radiologists with different diagnostic experience. Five study groups included the CAD group, the junior radiologist group, the intermediate-level radiologist group, the senior radiologist group, and the group in which the junior radiologist used the CAD system. The ultrasound findings were reviewed and compared with the histopathology diagnosis. Results: The CAD system for the diagnosis of thyroid cancer showed an accuracy of 80.35%, a sensitivity of 80.64%, a specificity of 80.13%, a positive predictive value (PPV) of 76.02%, a negative predictive value (NPV) of 84.12%, and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.87. The accuracy of the junior radiologists in diagnosing thyroid cancer using CAD was similar to that of intermediate-level radiologists (79.21% vs. 77.57%; P=0.427). Conclusions: The use of ultrasound CAD based on the TI-RADS showed potential for distinguishing between benign and malignant thyroid nodules and improved the diagnostic performance of junior radiologists.
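The accuracy, sensitivity, specificity, PPV, and NPV reported in abstracts like this one all derive from a single 2x2 confusion matrix. The short Python sketch below shows the standard definitions; the function name and the counts in the usage line are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary diagnostic metrics from confusion-matrix counts.

    tp/fn: malignant cases called malignant/benign;
    tn/fp: benign cases called benign/malignant.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # recall on malignant cases
        "specificity": tn / (tn + fp),  # recall on benign cases
        "ppv": tp / (tp + fp),          # precision of a malignant call
        "npv": tn / (tn + fn),          # precision of a benign call
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
print(metrics)
```

Sensitivity and specificity depend only on the reader or model, whereas PPV and NPV also shift with the proportion of malignant cases in the cohort.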

  • Research Article
  • Cited by 8
  • 10.1016/j.acra.2024.11.045
Development and Validation of Multiparametric MRI-based Interpretable Deep Learning Radiomics Fusion Model for Predicting Lymph Node Metastasis and Prognosis in Rectal Cancer: A Two-center Study
  • May 1, 2025
  • Academic Radiology
  • Yunjun Yang + 10 more

  • Research Article
  • 10.1186/s12880-025-02061-w
Diagnostic value of shear wave elastography for diabetic peripheral neuropathy: comparison between junior radiologists and senior radiologists.
  • Dec 29, 2025
  • BMC medical imaging
  • Rong-Li Peng + 6 more

Diabetic peripheral neuropathy (DPN) is a prevalent complication of diabetes mellitus, and is often underdiagnosed because of its variable clinical presentation and operator-dependent diagnostic tools. Shear wave elastography (SWE), which quantitatively evaluates tissue stiffness, has the potential to enhance conventional ultrasound by improving diagnostic accuracy and consistency. Nevertheless, a comprehensive analysis examining the extent to which the integration of SWE with conventional ultrasound can enhance the diagnostic performance of radiologists across varying levels of expertise has yet to be performed. In this study, a total of 458 lower extremities from patients with type 2 diabetes were examined via ultrasound and SWE. Four radiologists (two seniors and two juniors) independently assessed the grayscale ultrasound, SWE, and combined images. Diagnostic performance was compared via receiver operating characteristic (ROC) curves and sensitivity and specificity metrics. SWE measurements revealed significantly greater stiffness of the tibial nerve in the DPN group than in the non-DPN group, with values of 37.30 kPa versus 25.40 kPa (P < 0.001) and corresponding shear wave velocities of 3.54 m/s versus 2.90 m/s (P < 0.001). The combined images improved diagnostic accuracy across all readers. Notably, junior radiologists exhibited a substantial improvement in terms of sensitivity (ΔSensitivity = 25.565, 95% CI: 18.477-32.653, P = 0.004). In contrast, for the senior radiologists, neither the sensitivity nor the specificity increased significantly with the integration of SWE. Combining SWE with conventional ultrasound improves the diagnostic accuracy for DPN and helps reduce performance gaps between junior and senior radiologists. SWE may serve as an effective adjunct to support early detection and consistent evaluation of DPN in clinical practice.

  • Research Article
  • Cited by 2
  • 10.1148/ryai.240786
Adnexal Lesion Discrimination Using Deep Learning Analysis of Dynamic Contrast-enhanced US Images.
  • Nov 5, 2025
  • Radiology. Artificial intelligence
  • Manli Wu + 25 more

Purpose To develop a multimodality deep learning model (Ovarian Cancer Network [OCNet]) using dynamic contrast-enhanced US images to classify adnexal lesions. Materials and Methods This retrospective study included patients with pathologically confirmed adnexal lesions detected at US across 14 hospitals in China between January 2018 and July 2023. Data were divided into the training set (n = 275), internal testing set (n = 57), and external testing set (n = 63). Two deep learning models (OCNetmanual and OCNetautomated) were developed and compared with Ovarian-Adnexal Reporting and Data System (O-RADS) US and the Assessment of Different Neoplasias in the Adnexa (ADNEX) model. Diagnostic performances of radiologists with and without assistance of OCNet were also assessed. Results A total of 395 female patients (median age, 43 years; IQR, 31-55 years) were included (252 benign and 143 malignant lesions). OCNetmanual and OCNetautomated achieved an area under the receiver operating characteristic curve (AUC) of 0.94 (95% CI: 0.89, >0.99) and 0.91 (95% CI: 0.83, 0.99), respectively, outperforming O-RADS US (AUC, 0.79; 95% CI: 0.68, 0.89; P = .002 and P = .03, respectively) and the ADNEX model (AUC, 0.86; 95% CI: 0.77, 0.95; P = .04 and P = .36, respectively). Additionally, the assistance of OCNet enhanced diagnostic performance for junior radiologists, improving the average AUC from 0.86 to 0.94 and the average specificity from 52% to 73%. Conclusion The OCNet model achieved higher performance than O-RADS US and the ADNEX model for classifying adnexal lesions and improved the diagnostic performance of junior radiologists. Keywords: Adnexal Lesion, Deep Learning, Contrast-enhanced US, Multimodal Supplemental material is available for this article. © RSNA 2025 See also commentary by Huber and Adams in this issue.

  • Research Article
  • Cited by 5
  • 10.21037/gs-22-643
The added value of S-detect in the diagnostic accuracy of breast masses by senior and junior radiologist groups: a systematic review and meta-analysis
  • Dec 1, 2022
  • Gland Surgery
  • Peijun Chen + 6 more

Background: S-detect is an emerging computer-aided diagnosis (CAD) technique that provides a reference for radiologists to identify breast cancer. Some studies have shown that US (ultrasound) + S-detect can improve the diagnostic accuracy of junior radiologists more than senior radiologists, but the results are inconsistent in various studies. Therefore, this meta-analysis aimed to assess the value of S-detect combined with the US outcomes from senior and junior radiologists for the diagnosis of breast cancer. Methods: We searched the PubMed, Cochrane Library, Embase, Web of Science, and Wanfang databases, China Biology Medicine disc, China National Knowledge Infrastructure (CNKI), and VIP database for trials on the diagnostic accuracy of US + S-detect for the diagnosis of breast masses. The search time frame was from the date of establishment of the database to August 20, 2022. Two researchers independently screened the literature, extracted the information, and evaluated the quality of the included literature using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) scale. StataSE 15.1 software was utilized to assess pooled metrics, including sensitivity, specificity, and the area under the curve (AUC). Results: A total of 19 articles with 3,349 patients and 3,895 breast masses were included in this meta-analysis. Of these, seventeen articles evaluated the diagnostic performance of senior radiologists’ US + S-detect for breast cancer, while twelve articles reported junior radiologists’ diagnostic performance. The risk of bias was primarily attributed to patient selection, flow and timing. In the senior radiologist group, the pooled sensitivity and specificity of US + S-detect were 0.93 [95% confidence interval (CI): 0.89–0.95] and 0.86 (95% CI: 0.80–0.90), respectively, with an AUC of 0.96. As for the junior radiologist group, the pooled sensitivity and specificity of US + S-detect were 0.89 (95% CI: 0.83–0.93) and 0.79 (95% CI: 0.72–0.84), respectively, and the AUC was 0.91. Conclusions: The results of this meta-analysis showed that the pooled sensitivity and the AUC of both the senior and junior radiologist groups were high, with good diagnostic efficacy and high clinical application. However, the results of this study are highly heterogeneous and need to be validated by collecting more high-quality studies and accumulating a larger sample size.

  • Research Article
  • Cited by 5
  • 10.1007/s10278-024-01027-8
Diagnostic Accuracy of Ultra-Low Dose CT Compared to Standard Dose CT for Identification of Fresh Rib Fractures by Deep Learning Algorithm.
  • Jul 17, 2024
  • Journal of imaging informatics in medicine
  • Peikai Huang + 8 more

The present study aimed to evaluate the diagnostic accuracy of ultra-low dose computed tomography (ULD-CT) compared to standard dose computed tomography (SD-CT) in discerning recent rib fractures using a deep learning algorithm detection of rib fractures (DLADRF). A total of 158 patients undergoing forensic diagnosis for rib fractures were included in this study: 50 underwent SD-CT, and 108 were assessed using ULD-CT. Junior and senior radiologists independently evaluated the images to identify and characterize the rib fractures. The sensitivity of rib fracture diagnosis by radiologists and radiologist + DLADRF was better using SD-CT than ULD-CT. However, the diagnosis sensitivity of DLADRF using ULD-CT alone was slightly more than SD-CT. Nonetheless, no substantial differences were observed in specificity, positive predictive value, and negative predictive value between SD-CT and ULD-CT by the same radiologist, radiologist + DLADRF, and DLADRF (P > 0.05). The area under the curve (AUC) of receiver operating characteristic indicated that senior radiologist + DLADRF was significantly better than senior and junior radiologists, junior radiologists + DLADRF, and DLADRF alone using SD-CT or ULD-CT (all P < 0.05). Also, junior radiologists + DLADRF was better with ULD-CT than senior and junior radiologists (P < 0.05). The AUC of the rib fracture diagnosed by senior radiologists did not differ from DLADRF using ULD-CT. Also, no significant differences were observed between junior + AI and senior and between junior and DLADRF using SD-CT. DLADRF enhanced the diagnostic performance of radiologists in detecting recent rib fractures. The diagnostic outcomes between SD-CT and ULD-CT across radiologists' experience and DLADRF did not differ significantly.

  • Research Article
  • Cited by 7
  • 10.3389/fendo.2024.1299686
Applying machine-learning models to differentiate benign and malignant thyroid nodules classified as C-TIRADS 4 based on 2D-ultrasound combined with five contrast-enhanced ultrasound key frames.
  • Apr 3, 2024
  • Frontiers in Endocrinology
  • Jia-Hui Chen + 5 more

To apply machine learning to extract radiomics features from thyroid two-dimensional ultrasound (2D-US) combined with contrast-enhanced ultrasound (CEUS) images to classify and predict benign and malignant thyroid nodules, classified according to the Chinese version of the thyroid imaging reporting and data system (C-TIRADS) as category 4. This retrospective study included 313 pathologically diagnosed thyroid nodules (203 malignant and 110 benign). Two 2D-US images and five CEUS key frames ("2nd second after the arrival time" frame, "time to peak" frame, "2nd second after peak" frame, "first-flash" frame, and "second-flash" frame) were selected to manually label the region of interest using the "Labelme" tool. A total of 7 images of each nodule and their annotations were imported into the Darwin Research Platform for radiomics analysis. The datasets were randomly split into training and test cohorts in a 9:1 ratio. Six classifiers, namely, support vector machine, logistic regression, decision tree, random forest (RF), gradient boosting decision tree and extreme gradient boosting, were used to construct and test the models. Performance was evaluated using a receiver operating characteristic curve analysis. The area under the curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy (ACC), and F1-score were calculated. One junior radiologist and one senior radiologist reviewed the 2D-US image and CEUS videos of each nodule and made a diagnosis. We then compared their AUC and ACC with those of our best model. The AUCs for diagnosis using US, CEUS, and US combined with CEUS by the junior radiologist and senior radiologist were 0.755, 0.750, 0.784, 0.800, 0.873, 0.890, respectively. The RF classifier performed better than the other five, with an AUC of 1 for the training cohort and 0.94 (95% confidence interval 0.88-1) for the test cohort. The sensitivity, specificity, accuracy, PPV, NPV, and F1-score of the RF model in the test cohort were 0.82, 0.93, 0.90, 0.85, 0.92, and 0.84, respectively. The RF model with 2D-US combined with CEUS key frames achieved performance equivalent to that of the senior radiologist (AUC: 0.94 vs. 0.92, P = 0.798; ACC: 0.90 vs. 0.92) and outperformed the junior radiologist (AUC: 0.94 vs. 0.80, P = 0.039, ACC: 0.90 vs. 0.81) in the test cohort. Our model, based on 2D-US and CEUS key frames radiomics features, had good diagnostic efficacy for thyroid nodules classified as C-TIRADS 4. It shows promising potential in assisting less experienced junior radiologists.

  • Research Article
  • Cited by 11
  • 10.1016/j.ultrasmedbio.2021.02.027
Principal component regression-based contrast-enhanced ultrasound evaluation system for the management of BI-RADS US 4A breast masses: objective assistance for radiologists
  • Apr 8, 2021
  • Ultrasound in Medicine & Biology
  • Zi-Mei Lin + 7 more

  • Research Article
  • 10.1002/uog.28943
Abstracts of the 34th World Congress on Ultrasound in Obstetrics and Gynecology, 15-18 September 2024, Budapest, Hungary.
  • Sep 1, 2024
  • Ultrasound in obstetrics & gynecology : the official journal of the International Society of Ultrasound in Obstetrics and Gynecology
  • Y Wu + 1 more

Deep learning (DL) algorithms could improve the classification of ovarian tumours assessed with ultrasound (US) images and clinical information. This study aimed to develop a DL model for diagnosing ovarian tumours based on US images and clinical information, and to compare the performance of the DL model with radiologist assessment. This study retrospectively collected data from four hospitals involving women who underwent US examinations for ovarian tumours. Additionally, data were prospectively randomly collected from two other hospitals. Histopathological analysis served as the reference standard. The retrospective dataset was divided into training, test and internal validation sets. The prospective dataset was used as an external validation set. A DL model was developed. The performance of the DL model was compared with the assessments of eleven radiologists in the internal and external validation sets. The performance of the radiologists' assessments with and without the DL model was also compared. A total of 1518 women (6432 images) with ovarian tumours were included, with 364 women in the internal validation dataset and 340 women in the external validation dataset. In the external validation dataset, the accuracy of the DL model's classification of benign, borderline, and malignant ovarian tumours was 85.5%, which was comparable to the expert assessments (85.3%, p > 0.05) but significantly higher than the average of the eleven radiologists with different levels of experience (72.5%, p < 0.05). The diagnostic performance of junior and mid-level radiologists was significantly improved after applying the DL model. The DL model developed using US images and clinical information can effectively classify benign, malignant, and borderline ovarian tumours, with diagnostic performance comparable to expert assessment, and can improve the performance of junior and mid-level radiologists.

  • Research Article
  • Cited by 6
  • 10.3389/fendo.2024.1380829
Optimizing evaluation of endometrial receptivity in recurrent pregnancy loss: a preliminary investigation integrating radiomics from multimodal ultrasound via machine learning.
  • Aug 20, 2024
  • Frontiers in endocrinology
  • Shanling Yan + 4 more

Recurrent pregnancy loss (RPL) frequently links to a prolonged endometrial receptivity (ER) window, leading to the implantation of non-viable embryos. Existing ER assessment methods face challenges in reliability and invasiveness. Radiomics in medical imaging offers a non-invasive solution for ER analysis, but complex, non-linear radiomic-ER relationships in RPL require advanced analysis. Machine learning (ML) provides precision for interpreting these datasets, although research in integrating radiomics with ML for ER evaluation in RPL is limited. To develop and validate an ML model that employs radiomic features derived from multimodal transvaginal ultrasound images, focusing on improving ER evaluation in RPL. This retrospective, controlled study analyzed data from 346 unexplained RPL patients and 369 controls. The participants were divided into training and testing cohorts for model development and accuracy validation, respectively. Radiomic features derived from grayscale (GS) and shear wave elastography (SWE) images, obtained during the window of implantation, underwent a comprehensive five-step selection process. Five ML classifiers, each trained on either radiomic, clinical, or combined datasets, were trained for RPL risk stratification. The model demonstrating the highest performance in identifying RPL patients was selected for further validation using the testing cohort. The interpretability of this optimal model was augmented by applying Shapley additive explanations (SHAP) analysis. Analysis of the training cohort (242 RPL, 258 controls) identified nine key radiomic features associated with RPL risk. The extreme gradient boosting (XGBoost) model, combining radiomic and clinical data, demonstrated superior discriminatory ability. This was evidenced by its area under the curve (AUC) score of 0.871, outperforming other ML classifiers. Validation in the testing cohort of 215 subjects (104 RPL, 111 controls) confirmed its accuracy (AUC: 0.844) and consistency. 
SHAP analysis identified four endometrial SWE features and two GS features, along with clinical variables like age, SAPI, and VI, as key determinants in RPL risk stratification. Integrating ML with radiomics from multimodal endometrial ultrasound during the WOI effectively identifies RPL patients. The XGBoost model, merging radiomic and clinical data, offers a non-invasive, accurate method for RPL management, significantly enhancing diagnosis and treatment.

  • Research Article
  • Cited by 7
  • 10.1148/radiology.13130561
Two-View versus Single-View Shear-Wave Elastography: Comparison of Observer Performance in Differentiating Benign from Malignant Breast Masses
  • Oct 28, 2013
  • Radiology
  • Su Hyun Lee + 8 more

Purpose To determine whether two-view shear-wave elastography (SWE) improves the performance of radiologists in differentiating benign from malignant breast masses compared with single-view SWE. Materials and Methods This prospective study was conducted with institutional review board approval, and written informed consent was obtained. B-mode ultrasonographic (US) and orthogonal SWE images were obtained for 219 breast masses (136 benign and 83 malignant; mean size, 14.8 mm) in 219 consecutive women (mean age, 47.9 years; range, 20-78 years). Five blinded radiologists independently assessed the likelihood of malignancy for three data sets: B-mode US alone, B-mode US and single-view SWE, and B-mode US and two-view SWE. Interobserver agreement regarding Breast Imaging Reporting and Data System (BI-RADS) category and the area under the receiver operating characteristic curve (AUC) of each data set were compared. Results Interobserver agreement was moderate (κ = 0.560 ± 0.015 [standard error of the mean]) for BI-RADS category assessment with B-mode US alone. When SWE was added to B-mode US, five readers showed substantial interobserver agreement (κ = 0.629 ± 0.017 for single-view SWE; κ = 0.651 ± 0.014 for two-view SWE). The mean AUC of B-mode US was 0.870 (range, 0.855-0.884). The AUC of B-mode US and two-view SWE (average, 0.928; range, 0.904-0.941) was higher than that of B-mode US and single-view SWE (average, 0.900; range, 0.890-0.920), with statistically significant differences for three readers (P ≤ .003). Conclusion The performance of radiologists in differentiating benign from malignant breast masses was improved when B-mode US was combined with two-view SWE compared with that when B-mode US was combined with single-view SWE. © RSNA, 2013 Supplemental material: S1.

  • Research Article
  • Cited by 38
  • 10.3389/fonc.2022.897596
Ultrasound-based radiomics XGBoost model to assess the risk of central cervical lymph node metastasis in patients with papillary thyroid carcinoma: Individual application of SHAP
  • Aug 26, 2022
  • Frontiers in Oncology
  • Yan Shi + 10 more

Objectives: A radiomics-based explainable eXtreme Gradient Boosting (XGBoost) model was developed to predict central cervical lymph node metastasis (CCLNM) in patients with papillary thyroid carcinoma (PTC), including positive and negative effects. Methods: A total of 587 PTC patients admitted at Binzhou Medical University Hospital from 2017 to 2021 were analyzed retrospectively. The patients were randomized into the training and test cohorts with an 8:2 ratio. Radiomics features were extracted from ultrasound images of the primary PTC lesions. The minimum redundancy maximum relevance algorithm and the least absolute shrinkage and selection operator regression were used to select CCLNM positively-related features and radiomics scores were constructed. Clinical features, ultrasound features, and radiomics score were screened out by the Boruta algorithm, and the XGBoost model was constructed from these characteristics. SHapley Additive exPlanations (SHAP) was used for individualized and visualized interpretation. SHAP addressed the cognitive opacity of machine learning models. Results: Eleven radiomics features were used to calculate the radiomics score. Five critical elements were used to build the XGBoost model: capsular invasion, radiomics score, diameter, age, and calcification. The area under the curve was 91.53% and 90.88% in the training and test cohorts, respectively. SHAP plots showed the influence of each parameter on the XGBoost model, including positive (i.e., capsular invasion, radiomics score, diameter, and calcification) and negative (i.e., age) impacts. The XGBoost model outperformed the radiologist, increasing the AUC by 44%. Conclusions: The radiomics-based XGBoost model predicted CCLNM in PTC patients. Visual interpretation using SHAP made the model an effective tool for preoperative guidance of clinical procedures, including positive and negative impacts.

  • Research Article
  • 10.1016/j.ultrasmedbio.2025.08.010
An Interpretable Radiomics Model Integrating Ultrasound and Clinical Features for Multiclass Classification of Axillary Lymph Nodes.
  • Nov 1, 2025
  • Ultrasound in medicine & biology
  • Yin Zheng + 19 more

  • Research Article
  • Cited by 8
  • 10.4274/dir.2022.22826
Improved breast lesion detection in mammogram images using a deep neural network.
  • Jul 1, 2023
  • Diagnostic and Interventional Radiology
  • Wen Zhou + 5 more

This study aimed to investigate the effect of using a deep neural network (DNN) in breast cancer (BC) detection. In this retrospective study, a DNN-based model was constructed from a total of 880 mammograms that 220 patients underwent between April and June 2020. The mammograms were reviewed by two senior and two junior radiologists with and without the aid of the DNN model. The performance of the network was assessed by comparing the area under the curve (AUC) and receiver operating characteristic curves for the detection of four features of malignancy (masses, calcifications, asymmetries, and architectural distortions), with and without the aid of the DNN model and by the senior and junior radiologists. Additionally, the effect of utilizing the DNN on diagnosis time for both the senior and junior radiologists was evaluated. The AUCs of the model for the detection of mass and calcification were 0.877 and 0.937, respectively. In the senior radiologist group, the AUC values for evaluation of mass, calcification, and asymmetric compaction were significantly higher with the DNN model than those obtained without the model. Similar effects were observed in the junior radiologist group, but the increase in the AUC values was even more dramatic. The median mammogram assessment time of the junior and senior radiologists was 572 (357-951) s, and 273.5 (129-469) s, respectively, with the DNN model, and the corresponding assessment time without the model, was 739 (445-1003) s and 321 (195-491) s, respectively. The DNN model exhibited high accuracy in detecting the four named features of BC and effectively shortened the review time by both senior and junior radiologists.

  • Research Article
  • Cited by 1
  • 10.3389/fonc.2025.1616816
Contrast-enhanced CT-based deep learning model assists in preoperative risk classification of thymic epithelial tumors.
  • Jul 31, 2025
  • Frontiers in oncology
  • Xuhui Zhao + 8 more

This study aimed to develop and evaluate a deep learning (DL) model utilizing contrast-enhanced computed tomography (CT) to assist radiologists in accurately stratifying the risk of thymic epithelial tumors (TETs) based on the World Health Organization (WHO) classification. Clinical data were retrospectively collected from 266 patients with histopathologically confirmed TETs at two centers: Center 1 (training set, n=205) and Center 2 (external testing set, n=61). Six DL models (DenseNet 121, ResNet 101, Inception V3, VGG 11, MobileNet V2, and ShuffleNet V2) were developed and evaluated using venous-phase CT images, alongside a traditional radiomic model using a support vector machine (SVM) for comparison. Diagnostic performance of junior and senior radiologists in distinguishing low-risk thymoma (LRT) from high-risk thymoma (HRT) was assessed with and without the assistance of the optimal DL model. The ResNet 101 model emerged as the best performer among the six DL models, achieving an AUC of 0.876, accuracy of 0.820, sensitivity of 0.878, specificity of 0.700, positive predictive value of 0.857, and negative predictive value of 0.737 in the external testing set, outperforming the traditional radiomic model (AUC, p < 0.05). Notably, the DL model significantly improved junior radiologists' diagnostic performance, with an average AUC of 0.822, approaching senior radiologists' average AUC of 0.859 (p > 0.05). This study demonstrated that a DL model based on contrast-enhanced CT can reliably assist radiologists in preoperative risk stratification of TETs, bridging the diagnostic performance gap between junior and senior radiologists and supporting clinical decision-making.
