Empirical evaluation of variability and multi-institutional generalizability of deep learning survival models: application to renal cancer CT scans.


Similar Papers
  • Research Article
  • Cited by 5
  • 10.21037/qims-23-577
Deep learning radiomics analysis based on computed tomography for survival prediction in gastric neuroendocrine neoplasm: a multicenter study.
  • Dec 1, 2023
  • Quantitative Imaging in Medicine and Surgery
  • Zhihao Yang + 5 more

Survival prediction is crucial for patients with gastric neuroendocrine neoplasms (gNENs) to assess the treatment programs and may guide personalized medicine. This study aimed to develop and evaluate a deep learning (DL) radiomics model to predict the overall survival (OS) in patients with gNENs. The retrospective analysis included 162 consecutive patients with gNENs from two hospitals, who were divided into a training cohort, internal validation cohort (The First Affiliated Hospital of Zhengzhou University; n=108), and an external validation cohort (The Henan Cancer Hospital; n=54). DL radiomics analysis was applied to computed tomography (CT) images of the arterial phase and venous phase, respectively. Based on pretreatment CT images, two DL radiomics signatures were developed to predict OS. The combined model incorporating the radiomics signatures and clinical factors was built through the multivariable Cox proportional hazards (CPH) method. The combined model was visualized into a radiomics nomogram for individualized OS estimation. Prediction performance was assessed with the concordance index (C-index) and the Kaplan-Meier (KM) estimator. The DL-based radiomics signatures based on two phases were significantly correlated with OS in the training (C-index: 0.79-0.92; P<0.01), internal validation (C-index: 0.61-0.86; P<0.01), and external validation (C-index: 0.56-0.75; P<0.01) cohorts. The combined model integrating radiomics signatures with clinical factors showed a significant improvement in predictive performance compared to the clinical model in the training (C-index: 0.86 vs. 0.80; P<0.01), internal validation (C-index: 0.77 vs. 0.71; P<0.01), and external validation (C-index: 0.71 vs. 0.66; P<0.01) cohorts. 
Moreover, the combined model classified patients into high-risk and low-risk groups, and the high-risk group had a shorter OS compared to the low-risk group in the training cohort [hazard ratio (HR) 3.12, 95% confidence interval (CI): 2.34-3.93; P<0.01], which was validated in the internal (HR 2.51, 95% CI: 1.57-3.99; P<0.01) and external validation cohort (HR 1.77, 95% CI: 1.21-2.59; P<0.01). DL radiomics analysis could serve as a potential and noninvasive tool for prognostic prediction and risk stratification in patients with gNENs.
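The concordance index used above to assess prediction performance can be illustrated with a minimal sketch of Harrell's C-index for right-censored data. This is not the authors' code: the function name `concordance_index`, the convention that a higher risk score means shorter expected survival, and the choice to skip tied survival times are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch, not the study's implementation).

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : model risk scores; assumed higher = shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk died first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical toy data, not taken from any study in this list.
    print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the drop from training to external validation C-index values reported above is read as a loss of discrimination.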

  • Research Article
  • 10.1158/1557-3265.sabcs24-p4-03-16
Abstract P4-03-16: Comparison of analytical and prognostic performance among various Artificial Intelligence models for Tumor Infiltrating Lymphocytes scoring in Triple Negative Breast Cancer: An independent validation on a prospective cohort
  • Jun 13, 2025
  • Clinical Cancer Research
  • Nikolaos Tsiknakis + 10 more

Introduction: Tumor-infiltrating lymphocytes (TILs) have become a significant biomarker during recent years, showcasing their predictive and prognostic potential for early and metastatic triple-negative breast cancer (TNBC). However, pathologist-read stromal TILs (sTILs) remain a semi-quantitative biomarker, susceptible to inter-observer variability. With the surge in Artificial Intelligence (AI) research, various automated approaches have been proposed to score TILs with the promise to overcome the limitations of manual assessment. However, there is a lack of studies comparing different AI models in both analytical and clinical validity with respect to mimicking the challenges of clinical practice. Methods: In this study, we aimed to investigate the variability among ten AI-based TILs scoring models (seven own-developed machine learning models in QuPath [KNN, Random Forest, Neural Network] and three pre-trained deep learning models [HoverNet, Graham et al., Medical Image Analysis 2019; CellViT, Hörst et al., Medical Image Analysis 2024; Abousamra et al., Frontiers in Oncology 2022]) with respect to their analytical and clinical validity on internal and external validation sets. The development cohort consisted of diagnostic tissue slides of 79 women with surgically resected primary invasive TNBC tumors diagnosed between 2012 and 2016 from the Yale School of Medicine. An independent prospective set comprising 215 TNBC patients from Sweden diagnosed between 2010 and 2015, with 4 years median follow-up, was used for assessing the models’ clinical validity. The gold standard in this study was manual sTILs scoring by two expert pathologists. Results: Moderate correlation in analytical validity (Internal validation set: Spearman’s r=0.72-0.84, p<0.001; External validation set: Spearman’s r=0.63-0.73, p<0.001) is demonstrated across AI methodologies and training strategies. 
Training on a progressively increasing number of samples improved the correlation with sTILs in the internal (10 patients: r=0.79, 20: r=0.81, 30: r=0.82, 40: r=0.84, 50: r=0.83, p<0.001) but not in the external validation sets (10: r=0.70, 20: r=0.68, 30: r=0.70, 40: r=0.68, 50: r=0.73, p<0.001). HoverNet and CellViT achieved the second highest correlation with sTILs in the internal validation set (r=0.83, p<0.001) but second and third to worst in the external validation set (r=0.67 and r=0.64, p<0.001). Variabilities in the distribution of TILs scores were identified across models. Interestingly, eight out of ten models (KNN, RF, NN and HoverNet), even less extensively trained ones, showed statistically significant prognostic potential, with similar and overlapping hazard ratios (HR) in the external validation cohort (Cox regression based on IDFS-endpoint and dichotomized TILs scores at 10%, HRadjusted=0.38-0.50, p<0.047). For reference, manual sTILs demonstrated a HRadjusted=0.43 (p=0.003). Conclusion: Most AI TIL methods demonstrated similar and statistically significant clinical validity, which we believe may be attributed to the intrinsic robustness of TILs as a biomarker. The analytical discrepancies between the AI models should not be overlooked; rather, we believe that there is a need for a large and diverse clinical benchmark dataset to be used for independent model validation ensuring the comparability and reliability of AI tools before integration into the clinical practice. Citation Format: Nikolaos Tsiknakis, Joan Martinez Vidal, Johan Staaf, Ana Bosch, Anna Ehinger, Emma Nimeus, Roberto Salgado, Yalai Bai, David L. Rimm, Johan Hartman, Balazs Acs. Comparison of analytical and prognostic performance among various Artificial Intelligence models for Tumor Infiltrating Lymphocytes scoring in Triple Negative Breast Cancer: An independent validation on a prospective cohort [abstract]. 
In: Proceedings of the San Antonio Breast Cancer Symposium 2024; 2024 Dec 10-13; San Antonio, TX. Philadelphia (PA): AACR; Clin Cancer Res 2025;31(12 Suppl):Abstract nr P4-03-16.

  • Research Article
  • Cited by 24
  • 10.1002/jcsm.13139
Hand grip strength-based cachexia index as a predictor of cancer cachexia and prognosis in patients with cancer.
  • Nov 29, 2022
  • Journal of Cachexia, Sarcopenia and Muscle
  • Hailun Xie + 16 more

The cachexia index is a useful predictor for cancer cachexia and prognostic assessment. However, its use is limited because of high testing costs and complicated testing procedures. Thus, in this study, we aimed to develop a hand grip strength (HGS)-based cancer cachexia index (H-CXI) as a potential predictor of cancer cachexia and prognosis in patients with cancer. Here, 14 682 patients with cancer were studied, including the discovery (6592), internal validation (2820) and external validation (5270) cohorts. The H-CXI was calculated as [HGS (kg)/height (m)² × serum albumin (g/L)]/neutrophil-to-lymphocyte ratio. The Kaplan-Meier method was used to create survival curves, and the log-rank test was used to compare time-event relationships between groups. A Cox proportional hazard regression model was used to determine independent risk factors for overall survival (OS). Logistic regression analysis was used to assess the association of the H-CXI with short-term outcomes and cancer cachexia. There was a significant non-linear relationship between the H-CXI and OS in all cohorts. Patients with a low H-CXI had significantly lower OS than those with a high H-CXI in the discovery cohort (6-year survival percentage: 55.72% vs. 76.70%, log-rank P<0.001), internal validation cohort (6-year survival percentage: 55.81% vs. 76.70%, log-rank P<0.001), external validation cohort (6-year survival percentage: 56.05% vs. 75.48%, log-rank P<0.001) and total cohort (6-year survival percentage: 55.86% vs. 76.27%, log-rank P<0.001). Notably, the prognostic stratification effect of the H-CXI in patients with advanced-stage disease was more significant than that in patients with early-stage disease. 
The multivariate Cox proportional risk regression model confirmed that a low H-CXI negatively affected the prognosis of patients with cancer in the discovery cohort [hazard ratio (HR) 0.75, 95% confidence interval (CI) 0.71-0.80, P<0.001], internal validation cohort (HR 0.79, 95% CI 0.72-0.86, P<0.001), external validation cohort (HR 0.84, 95% CI 0.79-0.89, P<0.001) and total cohort (HR 0.80, 95% CI 0.77-0.83, P<0.001). Multivariate logistic regression models showed that a low H-CXI was an independent risk factor predicting adverse short-term outcomes and cancer cachexia in patients with cancer. The simple and practical H-CXI is a promising predictor for cancer cachexia and prognosis in patients with cancer.
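The H-CXI definition given in this abstract, hand grip strength divided by height squared, multiplied by serum albumin, and divided by the neutrophil-to-lymphocyte ratio, can be written as a short function. This is only a sketch of the stated formula: the function and parameter names are hypothetical, the example inputs are invented, and the abstract does not give the cut-off separating low from high H-CXI, so none is assumed here.

```python
def h_cxi(hgs_kg, height_m, albumin_g_per_l, neutrophils, lymphocytes):
    """H-CXI = [HGS (kg) / height (m)^2 * serum albumin (g/L)] / NLR.

    Parameter names are illustrative. Neutrophil and lymphocyte counts must
    share the same unit, since only their ratio (NLR) enters the formula.
    """
    if height_m <= 0 or neutrophils <= 0 or lymphocytes <= 0:
        raise ValueError("height and cell counts must be positive")
    nlr = neutrophils / lymphocytes  # neutrophil-to-lymphocyte ratio
    return (hgs_kg / height_m ** 2 * albumin_g_per_l) / nlr


if __name__ == "__main__":
    # Hypothetical patient values, chosen only to exercise the formula.
    print(round(h_cxi(30.0, 1.70, 40.0, 4.0, 2.0), 2))
```

Because the NLR sits in the denominator, stronger grip and higher albumin raise the index while systemic inflammation lowers it, which matches the abstract's reading of a low H-CXI as the unfavourable group.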

  • Research Article
  • Cited by 1
  • 10.1016/j.landig.2025.100952
The use of advanced machine learning to predict outcomes after atezolizumab plus bevacizumab for advanced hepatocellular carcinoma: a retrospective cohort study.
  • Feb 1, 2026
  • The Lancet. Digital health
  • Mathew Vithayathil + 58 more

Combination immune checkpoint inhibitors are recommended as first-line therapy for advanced hepatocellular carcinoma. However, only a third of patients respond to treatment, and improved approaches to predict response are required. Using baseline clinical data, we aimed to use advanced machine learning models to predict overall survival and progression-free survival in patients with advanced hepatocellular carcinoma receiving atezolizumab plus bevacizumab. This retrospective cohort study was conducted at 24 centres across eight countries. Patients aged 18 years and older with a histological or radiological diagnosis of advanced hepatocellular carcinoma were included; those who had received previous systemic therapy for hepatocellular carcinoma were excluded. All patients received intravenous atezolizumab 1200 mg plus bevacizumab 15 mg/kg once every 3 weeks until disease progression. Seven supervised machine learning models, in combination with 13 feature selection techniques, were trained on 44 baseline clinical variables for the prediction of overall survival and progression-free survival. The three best-performing models, combined with their optimum feature selection techniques, were used to develop ensemble machine learning models for the prediction of overall survival and progression-free survival. The primary outcomes of the study were the predictions of overall survival, progression-free survival, and immunotherapy response using advanced machine learning. k-means clustering was used to stratify patients into two groups: those at low risk and those at high risk of either death (in the overall survival model) or disease progression (in the progression-free survival model). 934 patients who received immunotherapy from May 1, 2018 and were followed up until Oct 1, 2023 were screened, of whom 160 were excluded and 774 were included in the final study. Patients were divided into training (n=339), internal validation (n=146) and external validation (n=289) cohorts. 
Support vector machine, neural network, and naive Bayes algorithms had the best performance in the prediction of overall survival; for progression-free survival, the highest-performing algorithms were ridge regression, naive Bayes, and logistic regression. In the external validation cohort, the ensemble model for the prediction of overall survival (area under the receiver operating characteristic curve 0·75 [95% CI 0·69-0·81]) significantly outperformed all eight of the tested clinical benchmark variables: Barcelona Clinic Liver Cancer (BCLC) stage (0·54 [0·48-0·61]; p<0·0001), α-fetoprotein (AFP) concentration (0·60 [0·54-0·67]; p=0·0007), albumin-bilirubin (ALBI) grade (0·64 [0·58-0·71]; p=0·0003), neutrophil-to-lymphocyte ratio (0·56 [0·49-0·62]; p<0·0001), platelet-to-lymphocyte ratio (0·51 [0·44-0·58]; p<0·0001), combined ALBI grade and BCLC stage (0·67 [0·60-0·73]; p=0·0074), and two BCLC subclassifications (0·62 [0·55-0·69]; p=0·0007 and 0·61 [0·55-0·68]; p=0·0018). The ensemble model for the prediction of progression-free survival (0·64 [0·59-0·70]) outperformed five of the eight clinical predictors: BCLC stage (0·52 [0·46-0·58]; p<0·0001), neutrophil-to-lymphocyte ratio (0·53 [0·47-0·59]; p=0·0069), platelet-to-lymphocyte ratio (0·54 [0·48-0·60]; p=0·016), and two BCLC subclassifications (0·57 [0·50-0·64]; p=0·020 and 0·55 [0·49-0·62]; p=0·0091); the model did not outperform AFP concentration (0·59 [0·53-0·64]; p=0·14), ALBI grade (0·62 [0·56-0·67]; p=0·44), or combined ALBI grade and BCLC stage (0·59 [0·53-0·66]; p=0·12). 
For the overall survival model, patients stratified into the low-risk group had significantly longer median overall survival (16·4 months [95% CI 14·2-21·6]) than those in the high-risk group (4·8 months [3·0-6·9]; p<0·0001); similarly, patients stratified by the progression-free survival model into the low-risk group had significantly longer median progression-free survival (8·9 months [7·3-11·1]) than those in the high-risk group (3·7 months [2·9-5·6]; p=0·0021). Our advanced machine learning models, which use routinely collected baseline clinical variables, are robust and externally validated and outperform established clinical biomarkers for predicting clinical outcomes with atezolizumab plus bevacizumab. These data-driven models could be used to stratify patients with hepatocellular carcinoma for personalised treatment strategies. Funding: None.

  • Research Article
  • Cited by 86
  • 10.1016/j.radonc.2020.06.010
A deep learning risk prediction model for overall survival in patients with gastric cancer: A multicenter study
  • Jun 12, 2020
  • Radiotherapy and Oncology
  • Liwen Zhang + 10 more


  • Research Article
  • Cited by 1
  • 10.2139/ssrn.3514655
A Deep Learning Risk Prediction Model for Overall Survival in Patients with Gastric Cancer: A Multicenter Study
  • Jan 6, 2020
  • SSRN Electronic Journal
  • Liwen Zhang + 10 more


  • Research Article
  • Cited by 4
  • 10.1038/s41598-025-87731-z
Development and validation of a prognostic model for critically ill type 2 diabetes patients in ICU based on composite inflammatory indicators
  • Jan 29, 2025
  • Scientific Reports
  • Lin Liu + 4 more

Type 2 diabetes mellitus (T2DM) is a chronic metabolic disorder, and critically ill patients with T2DM in intensive care unit (ICU) have an increased risk of mortality. In this study, we investigated the relationship between nine inflammatory indicators and prognosis in critically ill patients with T2DM to provide a clinical reference for assessing the prognosis of patients admitted to the ICU. Critically ill patients with T2DM were extracted from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database and divided into training and testing sets (7:3 ratio). An external validation cohort was collected from a single center in China using identical criteria. Logistic and Cox regression analyses were used to evaluate the relationship between nine inflammatory indicators and ICU, 30-day, and 90-day mortality rates. Significant predictive variables were chosen using least absolute shrinkage selection operator (LASSO) regression from logistic regression results, and a prognostic prediction model was built with multivariate logistic regression. The model was validated in both test and external validation sets. A total of 4,783 patients were included for model development and testing; an additional 204 served as the external validation set. The levels of eight inflammatory indicators were significantly correlated with short-term prognosis in critically ill patients with T2DM (P < 0.05 for all). The prediction model showed excellent discrimination performance, with AUC values of 0.825 (95% CI, 0.785–0.864) in the test set and 0.741 (95% CI, 0.630–0.851) in the external validation set. Calibration curves demonstrated strong consistency in both sets. In addition, decision curve analysis showed a net clinical benefit within 1–60% threshold probability in the test set and 10–41% threshold probability in the external validation set. Eight inflammatory indicators were identified as independent risk factors for prognosis in critically ill patients with T2DM. 
The prediction model showed promising performance in both internal and external validation cohorts, highlighting its potential as a valuable tool for early risk stratification and prediction of the outcomes of personalized treatment strategies in ICU settings.

  • Research Article
  • 10.3389/fneur.2026.1780370
A multicenter clinical nomogram for predicting post-stroke fatigue: development and validation
  • Apr 13, 2026
  • Frontiers in Neurology
  • Xiaoqing Tao + 5 more

Background and purpose: Post-stroke fatigue (PSF) is a common and disabling complication after stroke, yet its pathophysiological mechanisms remain unclear and reliable prediction tools are lacking. This study aimed to identify risk factors for PSF and develop a visualized nomogram for early prediction based on clinical and laboratory data. Methods: We conducted a retrospective cohort study of stroke patients hospitalized in the Department of Neurology at the First Affiliated Hospital of Chongqing Medical University, who were randomly split into training (n = 592) and internal validation (n = 254) sets. An independent cohort of 440 patients from Nanchong Central Hospital was used as the external validation cohort. Fatigue was assessed at week 4 after admission using the Fatigue Severity Scale (FSS) and Fatigue Assessment Scale (FAS). Demographic, clinical, imaging, and laboratory data were collected. LASSO regression was used for variable selection, followed by multivariate logistic regression to construct a nomogram. Model performance was assessed using the area under the curve (AUC), calibration curves, and decision curve analysis (DCA), with internal and external validation via bootstrapping. Results: A total of 846 stroke patients were enrolled and randomly split into training (n = 592) and internal validation (n = 254) sets, and the independent cohort of 440 patients formed the external validation set. Eight independent predictors of PSF were identified: brainstem, basal ganglia, and thalamic lesions, female sex, older age, modified Rankin Scale (mRS) score, white blood cell (WBC) count, and C-reactive protein (CRP) level (all p < 0.05). The nomogram showed good discrimination (AUC: 0.870, 0.862, and 0.672 for training, internal, and external validation sets, respectively), calibration, and clinical utility. Conclusion: We developed a clinically applicable nomogram based on routinely available data for early prediction of PSF. 
The model demonstrated good accuracy and may aid in identifying high-risk patients to guide timely intervention.

  • Research Article
  • 10.3389/fonc.2025.1555824
Visceral adipose tissue in the lesser omentum predicts lymphovascular invasion, perineural invasion and survival in gastric cancer
  • Jun 19, 2025
  • Frontiers in Oncology
  • Ping-Ping Liu + 9 more

Background: Visceral adipose tissue is associated with clinical outcomes in patients with cancer. This study aimed to investigate the relationship between preoperative visceral adipose tissue in the lesser omentum and clinical prognosis, as well as lymphovascular invasion (LVI) and perineural invasion (PNI), in patients with gastric cancer (GC). Patients and methods: A total of 943 GC patients who underwent radical surgery across three centers in China were included in the study. The patients were divided into one main cohort (center 1) consisting of 389 cases for the primary set and 165 cases for the internal validation set, as well as two external validation cohorts. Preoperative visceral fat area (VFA) in the lesser omentum was measured through radiological assessments using standard computed tomography. Survival analysis was conducted using Kaplan-Meier plots and Cox proportional risk regression models. Additionally, logistic regression analysis was utilized to identify independent risk factors for LVI and PNI in GC. Results: Patients with low VFA in the lesser omentum (VFA-lesser omentum) exhibited shorter overall survival compared to those with high VFA-lesser omentum [training set: hazard ratio 0.791, 95% CI 0.665-0.941, p = 0.008; validation set: hazard ratio 0.882, 95% CI 0.792-0.982, p = 0.022]. Furthermore, reduced VFA-lesser omentum was an independent risk factor for LVI (odds ratio [OR] 0.917, 95% CI 0.860-0.978, p = 0.008) and PNI (OR 0.933, 95% CI 0.878-0.990, p = 0.023). The results were confirmed in the internal and external validation sets (both p < 0.05). Conclusion: Preoperative VFA-lesser omentum was associated with PNI and LVI. In addition, reduced VFA-lesser omentum predicts poor survival in GC patients.

  • Research Article
  • 10.4103/bc.bc_113_25
Explainable machine learning versus logistic regression for outcome prediction in primary intracerebral hemorrhage: A multicenter radiomics study
  • Feb 23, 2026
  • Brain Circulation
  • Huan Wang + 9 more

Abstract: CONTEXT: Accurate outcome prediction is essential for clinical decisions in intracerebral hemorrhage (ICH) patients. However, whether machine learning (ML) models outperform traditional logistic regression (LR) remains unclear. AIMS: This study aims to compare six ML algorithms with LR in predicting poor 3-month outcomes after primary ICH, using radiomics features from noncontrast computed tomography and clinical data. SETTINGS AND DESIGN: A retrospective study. SUBJECTS AND METHODS: Seven hundred and four primary ICH patients from two centers were allocated into training (n = 516), internal (n = 128), and external validation (n = 60) cohorts. Radiomics features from hematoma regions were extracted to generate a radiomics score (Rad-score). STATISTICAL ANALYSIS USED: The Rad-score and clinical variables were selected for developing one LR and six ML models: random forest (RF), artificial neural network (ANN), AdaBoostM1, Naive Bayes (NB), XGB, and support vector machine (SVM). Model discrimination was assessed by the area under the curve (AUC), and the best-performing ML model was interpreted using Shapley Additive exPlanations (SHAP). RESULTS: In the training cohort, AUCs were 0.849 for LR, 0.897 for RF, 0.885 for XGB, 0.884 for AdaBoostM1, 0.858 for ANN, 0.848 for NB, and 0.839 for SVM. In the internal and external validation cohorts, AUCs ranged from 0.796–0.823 and 0.806–0.858, respectively. The RF model achieved significantly higher AUCs than LR in both training and external validation sets (both P < 0.05). SHAP plots identified Rad-score and National Institutes of Health Stroke Scale as key predictors. CONCLUSIONS: The RF model, integrating radiomic and clinical data, outperformed LR and showed the highest accuracy in predicting poor 3-month outcomes after primary ICH.

  • Research Article
  • 10.3389/fonc.2025.1661212
Personalized ICU mortality assessment by interpretable machine learning algorithms in patients with sepsis combined lung cancer: a population-based study and an external validation cohort
  • Oct 1, 2025
  • Frontiers in Oncology
  • Hongjie Tang + 2 more

Purpose: Sepsis is a leading cause of mortality, especially among immunocompromised patients with lung cancer. We aimed to establish a machine learning (ML)-based model to accurately forecast ICU mortality in patients with sepsis combined with lung cancer. Methods: We included patients with sepsis combined with lung cancer from the Medical Information Mart for Intensive Care IV (MIMIC IV) database. Univariate and multivariate logistic analyses were employed to select variables. The Recursive Feature Elimination (RFE) method based on 6 ML algorithms was used for feature selection. We used 13 ML algorithms to construct prediction models, which were assessed by area under the curve (AUC), accuracy, sensitivity, specificity, precision, cross-entropy and Brier scores. The best ML model was constructed to predict ICU mortality, and the predictive results were interpreted with the SHapley Additive exPlanations (SHAP) framework. Results: A total of 1096 lung cancer patients with sepsis from the MIMIC IV database and 251 patients from the external validation set were included. We used 13 clinical variables to establish a prediction model for ICU mortality. The CatBoost model was identified as the best prediction model, with the highest AUC in the training (0.931 [0.921, 0.945]), internal validation (0.698 [0.673, 0.724]) and external validation (0.794 [0.725, 0.879]) cohorts. The Oxford Acute Severity of Illness Score (OASIS) had the greatest influence on ICU mortality according to SHAP interpretation. Conclusions: Our ML models demonstrate excellent accuracy and reliability, facilitating more rigorous personalized prognostic forecasts for lung cancer patients with sepsis.

  • Research Article
  • Cited by 9
  • 10.3390/cancers11111721
Tumor Marker-Based Definition of the Transarterial Chemoembolization-Refractoriness in Intermediate-Stage Hepatocellular Carcinoma: A Multi-Cohort Study
  • Nov 4, 2019
  • Cancers
  • Jun Sik Yoon + 15 more

Background: For patients with hepatocellular carcinoma (HCC), the definition of refractoriness to transarterial chemoembolization (TACE), which might make them a candidate for systemic therapy, is still controversial. We aimed to derive and validate a tumor marker-based algorithm to define the refractoriness to TACE in patients with intermediate-stage HCC. Methods: This multi-cohort study comprised patients who underwent TACE for treatment-naïve intermediate-stage HCC. We derived a prediction model for overall survival (OS) using the pre- and post-TACE model to predict tumor recurrence after living donor liver transplantation (MoRAL) (i.e., MoRAL score = 11×√protein induced by vitamin K absence-II + 2×√alpha-fetoprotein), which was proven to reflect both tumor burden and biologic aggressiveness of HCC in the explant liver, from a training cohort (n = 193). These results were externally validated in both an independent hospital cohort (from two large-volume centers, n = 140) and a Korean National Cancer Registry sample cohort (n = 149). Results: The change in MoRAL score (ΔMoRAL) after initial TACE was an independent predictor of OS (MoRAL-increase vs. MoRAL-non-increase: adjusted hazard ratio (HR) = 2.18, 95% confidence interval (CI) = 1.37–3.46, p = 0.001; median OS = 18.8 vs. 37.8 months). In a subgroup of patients with a high baseline MoRAL score (≥89.5, 25th percentile and higher), the prognostic impact of ΔMoRAL was more pronounced (MoRAL-increase vs. MoRAL-non-increase: HR = 3.68, 95% CI = 1.54–8.76, p < 0.001; median OS = 9.9 vs. 37.4 months). These results were reproduced in the external validation cohorts. Conclusion: The ΔMoRAL after the first TACE, a simple and objective index, provides refined prognostication for patients with intermediate-stage HCC. 
Proceeding to a second TACE may not provide additional survival benefits in cases of a MoRAL-increase after the first TACE in patients with a high baseline MoRAL score (≥89.5), who might be candidates for systemic therapy.
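The MoRAL score formula and the ΔMoRAL grouping described in this abstract can be sketched as follows. The formula and the 89.5 baseline cut-off come from the abstract; everything else is an assumption for illustration: the function names are hypothetical, the abstract does not state measurement units, and "MoRAL-increase" is taken here to mean any post-TACE score strictly above the pre-TACE score, which the abstract does not spell out.

```python
import math

HIGH_BASELINE_CUTOFF = 89.5  # baseline MoRAL threshold quoted in the abstract


def moral_score(pivka_ii, afp):
    """MoRAL score = 11 * sqrt(PIVKA-II) + 2 * sqrt(AFP), as given in the abstract."""
    if pivka_ii < 0 or afp < 0:
        raise ValueError("tumor marker values must be non-negative")
    return 11 * math.sqrt(pivka_ii) + 2 * math.sqrt(afp)


def delta_moral_group(pre_score, post_score):
    """Classify the change after the first TACE.

    Assumption: 'increase' means post_score > pre_score.
    """
    return "MoRAL-increase" if post_score > pre_score else "MoRAL-non-increase"


def has_high_baseline(pre_score):
    """True when the baseline score meets the >= 89.5 subgroup definition."""
    return pre_score >= HIGH_BASELINE_CUTOFF


if __name__ == "__main__":
    # Hypothetical marker values, not patient data.
    pre = moral_score(100.0, 25.0)
    post = moral_score(144.0, 36.0)
    print(pre, post, delta_moral_group(pre, post), has_high_baseline(pre))
```

The square roots compress the very wide range that both tumor markers can span, so a change in the score reflects proportional rather than absolute shifts in marker levels.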

  • Research Article
  • Cited by 4
  • 10.1016/j.cmpb.2025.108683
Prognostic power of radiomics in head and neck cancers: Insights from a meta-analysis.
  • Apr 1, 2025
  • Computer methods and programs in biomedicine
  • Ting-Wei Wang + 6 more


  • Research Article
  • Cited by 15
  • 10.1177/20552076221143249
Artificial intelligence-enabled electrocardiogram screens low left ventricular ejection fraction with a degree of confidence.
  • Jan 1, 2022
  • Digital health
  • Chun-Ho Lee + 8 more

Artificial intelligence-enabled electrocardiogram has become a substitute tool for echocardiography in left ventricular ejection fraction estimation. However, the direct use of artificial intelligence-enabled electrocardiogram may not be trustworthy due to the uncertainty of the prediction. The study aimed to establish an artificial intelligence-enabled electrocardiogram with a degree of confidence to identify left ventricular dysfunction. The study collected 76,081 and 11,771 electrocardiograms from an academic medical center and a community hospital to establish and validate the deep learning model, respectively. The proposed deep learning model provided the point estimation of the actual ejection fraction and its standard deviation derived from the maximum probability density function of a normal distribution. The primary analysis focused on the accuracy of identifying patients with left ventricular dysfunction (ejection fraction ≤ 40%). Since the standard deviation was an uncertainty indicator in a normal distribution, we used it as a degree of confidence in the artificial intelligence-enabled electrocardiogram. We further explored the clinical application of estimated standard deviation and followed up on the new-onset left ventricular dysfunction in patients with initially normal ejection fraction. The areas under the receiver operating characteristic curve (AUC) for detecting left ventricular dysfunction were 0.9549 and 0.9365 in the internal and external validation sets, respectively. After excluding the cases with a lower degree of confidence, the artificial intelligence-enabled electrocardiogram performed better in the remaining cases in internal (AUC = 0.9759) and external (AUC = 0.9653) validation sets. 
In risk stratification for future left ventricular dysfunction among patients with initially normal ejection fraction, a positive artificial intelligence-enabled electrocardiogram was associated with a 4.57-fold risk of future left ventricular dysfunction in the internal validation set. The hazard ratio increased to 8.67 after excluding the cases with a lower degree of confidence. This trend was also validated in the external validation set. The deep learning model with a degree of confidence can provide advanced improvements in identifying left ventricular dysfunction and serve as a decision support and management-guided screening tool for prognosis.
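The idea of using the predicted standard deviation as a degree of confidence, and re-evaluating the AUC after excluding low-confidence cases, can be sketched in a few lines. This is a toy illustration rather than the study's pipeline: the record fields, the threshold of 10 on the standard deviation, and the data are all hypothetical, and the AUC is computed with a plain pairwise (Mann-Whitney) count instead of any particular library.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(records, max_std=None):
    """AUC for detecting EF <= 40%, optionally keeping only confident cases.

    records : dicts with hypothetical keys ef_true, ef_pred, std
    max_std : keep a case only if its predicted std is <= max_std (assumed rule)
    """
    kept = [r for r in records if max_std is None or r["std"] <= max_std]
    labels = [1 if r["ef_true"] <= 40 else 0 for r in kept]
    scores = [-r["ef_pred"] for r in kept]  # lower predicted EF = higher risk
    return auc(labels, scores)


if __name__ == "__main__":
    # Invented values for illustration only.
    data = [
        {"ef_true": 35, "ef_pred": 38, "std": 3},
        {"ef_true": 60, "ef_pred": 58, "std": 4},
        {"ef_true": 30, "ef_pred": 50, "std": 12},
        {"ef_true": 55, "ef_pred": 45, "std": 5},
        {"ef_true": 38, "ef_pred": 36, "std": 2},
        {"ef_true": 65, "ef_pred": 42, "std": 11},
    ]
    print(evaluate(data), evaluate(data, max_std=10))
```

In this toy set the two poorly predicted cases also carry the largest standard deviations, so dropping them raises the AUC, mirroring the direction of the effect the abstract reports. The trade-off is that the excluded cases receive no automated prediction.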

  • Research Article
  • Cited by 18
  • 10.1002/bjs5.50167
Nomograms predicting survival and recurrence in colonic cancer in the era of complete mesocolic excision
  • Apr 26, 2019
  • BJS Open
  • Y Kanemitsu + 7 more

Background: More extensive lymphadenectomy may improve survival after resection of colonic cancer. Nomograms were created predicting overall survival and recurrence for patients who undergo D2–D3 lymph node dissection, and their validity determined. Methods: This was a multicentre study of patients with colonic cancer who underwent resection with D2–D3 lymph node dissection in Japan. Inclusion criteria included R0 resection. A training cohort of patients operated on from 2007 to 2008 was analysed to construct prognostic models predicting survival and recurrence. Discrimination and calibration were performed using an external validation cohort from the Japanese colorectal cancer registry (procedures in 2005–2006). Results: The training cohort consisted of 2746 patients. Predictors of survival were: age (hazard ratio (HR) 1·04), female sex (HR 0·71), depth of tumour invasion (HR 1·15, 1·22, 2·96 and 3·14 for T2, T3, T4a and T4b respectively versus T1), lymphatic invasion (HR 1·11, 1·15 and 2·95 for ly1, ly2 and ly3 versus ly0), preoperative carcinoembryonic antigen (CEA) level (HR 1·21, 1·59 and 1·99 for 5·1–10·0, 10·1–20·0 and 20·1 and over versus 0–5·0 ng/ml), number of metastatic lymph nodes (HR 1·07), number of lymph nodes examined (HR 0·98) and extent of lymphadenectomy (HR 0·23, 0·13 and 0·11 for D1, D2 and D3 versus D0). Predictors of recurrence were: female sex (HR 0·82), macroscopic type (HR 3·82, 4·56, 6·66, 7·74 and 3·22 for types I, II, III, IV and V versus type 0), depth of invasion (HR 1·25, 2·66, 5·32 and 6·43 for T2, T3, T4a and T4b versus T1), venous invasion (HR 1·43, 3·05 and 4·79 for v1, v2 and v3 versus v0), preoperative CEA level (HR 1·39, 1·43, 1·56 and 1·85 for 5·1–10·0, 10·1–20·0, 20·1–40·0 and 40·1 or more versus 0–5 ng/ml), number of metastatic lymph nodes (HR 1·07) and number of lymph nodes examined (HR 0·98). The validation cohort comprised 4446 patients. 
The internal and external validated Harrell's C-index values for the nomogram predicting survival were 0·75 and 0·74 respectively. Corresponding values for recurrence were 0·78 and 0·75. Conclusion: These nomograms could predict survival and recurrence after curative resection of colonic cancer.
