An Interpretable Machine Learning Approach to Predicting Depression and Diabetes

Similar Papers
  • Research Article
  • Citations: 1
  • 10.1200/cci-24-00178
Impact of Tumor Location on Predicting Early-Stage Breast Cancer Patient Survivability Using Explainable Machine Learning Models.
  • Mar 1, 2025
  • JCO clinical cancer informatics
  • Nader Abdalnabi + 5 more

This study aims to investigate the impact of tumor quadrant location on the prediction of 5-year survivability in early-stage breast cancer using explainable machine learning (ML) models. By integrating these predictive models with Shapley Additive Explanations (SHAP), feature importance, and coefficient effect size, we aim to provide insights into the significant factors influencing patient outcomes. Data from 401 patients with early-stage breast cancer at the University of Missouri's Ellis Fischel Cancer Center were used, encompassing 20 variables related to demographics, tumor characteristics, and therapeutics. Six ML models, namely, Extreme Gradient Boosting, Random Forest classifier, Logistic Regression, Decision Tree classifier (DT), Support Vector Machine classifier, and AdaBoost (ADB), were trained and evaluated using various performance metrics, including accuracy, sensitivity, specificity, F1-score, area under the receiver operating characteristic curve (AUC-ROC), and area under the precision-recall curve (AUC-PR). Feature importance, coefficient effect size, and SHAP values were used to interpret and visualize the importance of different features, particularly focusing on tumor quadrant variables. The extreme gradient boosting model outperformed other models, achieving an AUC-ROC score of 0.98 and an AUC-PR score of 0.97. The analysis revealed that tumor quadrant variables, especially the upper outer and miscellaneous or overlapping sites, were among the top predictive features for breast cancer survivability. SHAP analysis further highlighted the significance of these tumor locations in influencing survival outcomes. This study demonstrates the efficacy of explainable ML models in predicting 5-year early-stage breast cancer survivability and identifies tumor quadrant location as an independent prognostic factor. 
The use of SHAP values provides a clear interpretation of the model's predictions, offering valuable insights for clinicians to refine treatment protocols and improve patient outcomes.
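
The AUC-ROC metric reported in this abstract can be read as the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition; the function name `auc_roc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def auc_roc(labels, scores):
    """AUC-ROC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = auc_roc(toy_labels, toy_scores)
print(f"AUC-ROC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; library implementations typically sort the scores instead.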

  • Research Article
  • Citations: 53
  • 10.3390/jpm12020228
Explainable Machine Learning Model for Predicting First-Time Acute Exacerbation in Patients with Chronic Obstructive Pulmonary Disease.
  • Feb 7, 2022
  • Journal of Personalized Medicine
  • Chew-Teng Kor + 5 more

Background: The study developed accurate explainable machine learning (ML) models for predicting first-time acute exacerbation of chronic obstructive pulmonary disease (COPD, AECOPD) at an individual level. Methods: We conducted a retrospective case–control study. A total of 606 patients with COPD were screened for eligibility using registry data from the COPD Pay-for-Performance Program (COPD P4P program) database at Changhua Christian Hospital between January 2017 and December 2019. Recursive feature elimination technology was used to select the optimal subset of features for predicting the occurrence of AECOPD. We developed four ML models to predict first-time AECOPD, and the highest-performing model was applied. Finally, an explainable approach based on ML and the SHapley Additive exPlanations (SHAP) and a local explanation method were used to evaluate the risk of AECOPD and to generate individual explanations of the model’s decisions. Results: The gradient boosting machine (GBM) and support vector machine (SVM) models exhibited superior discrimination ability (area under curve [AUC] = 0.833 [95% confidence interval (CI) 0.745–0.921] and AUC = 0.836 [95% CI 0.757–0.915], respectively). The decision curve analysis indicated that the GBM model exhibited a higher net benefit in distinguishing patients at high risk for AECOPD when the threshold probability was <0.55. The COPD Assessment Test (CAT) and the symptom of wheezing were the two most important features and exhibited the highest SHAP values, followed by monocyte count and white blood cell (WBC) count, coughing, red blood cell (RBC) count, breathing rate, oral long-acting bronchodilator use, chronic pulmonary disease (CPD), systolic blood pressure (SBP), and others. Higher CAT score; monocyte, WBC, and RBC counts; BMI; diastolic blood pressure (DBP); neutrophil-to-lymphocyte ratio; and eosinophil and lymphocyte counts were associated with AECOPD. 
The presence of symptoms (wheezing, dyspnea, coughing), chronic disease (CPD, congestive heart failure [CHF], sleep disorders, and pneumonia), and use of COPD medications (triple-therapy long-acting bronchodilators, short-acting bronchodilators, oral long-acting bronchodilators, and antibiotics) were also positively associated with AECOPD. A high breathing rate, heart rate, or systolic blood pressure and methylxanthine use were negatively correlated with AECOPD. Conclusions: The ML model was able to accurately assess the risk of AECOPD. The ML model combined with SHAP and the local explanation method was able to provide interpretable and visual explanations of individualized risk predictions, which may assist clinical physicians in understanding the effects of key features in the model and the model’s decision-making process.
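
The decision curve analysis mentioned in this abstract compares models by net benefit at a chosen threshold probability. A minimal sketch of the standard net benefit formula, NB = TP/n - (FP/n) * pt/(1 - pt), is given below; the function name `net_benefit` and the toy data are assumptions for illustration and are not taken from the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit at threshold pt: TP/n - (FP/n) * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0]
toy_probs = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.8]
pt = 0.2

nb_model = net_benefit(toy_labels, toy_probs, pt)
# "Treat all" reference strategy: every patient is flagged as high risk.
nb_all = net_benefit(toy_labels, [1.0] * len(toy_labels), pt)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}")
```

A decision curve repeats this calculation over a range of thresholds and plots the model against the "treat all" and "treat none" (net benefit 0) strategies.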

  • Research Article
  • Citations: 17
  • 10.1371/journal.pone.0317819
Development and validation of interpretable machine learning models for triage patients admitted to the intensive care unit.
  • Feb 18, 2025
  • PloS one
  • Zheng Liu + 4 more

This study aimed to develop and validate interpretable machine learning (ML) models for predicting whether triaged patients need to be admitted to the intensive care unit (ICU). The study analyzed 189,167 emergency patients from the Medical Information Mart for Intensive Care IV database, with the outcome being ICU admission. Three models were compared: Model 1 based on Emergency Severity Index (ESI), Model 2 on vital signs, and Model 3 on vital signs, demographic characteristics, medical history, and chief complaints. Nine ML algorithms were employed. The area under the receiver operating characteristic curve (AUC), F1 Score, Positive Predictive Value, Negative Predictive Value, Brier score, calibration curves, and decision curve analysis were used to evaluate the performance of the models. SHapley Additive exPlanations was used for explaining ML models. The AUC of Model 3 was superior to that of Model 1 and Model 2. In Model 3, the top four algorithms with the highest AUC were Gradient Boosting (0.81), Logistic Regression (0.81), naive Bayes (0.80), and Random Forest (0.80). Upon further comparison of the four algorithms, Gradient Boosting was slightly superior to Random Forest and Logistic Regression, while naive Bayes performed the worst. This study developed an interpretable ML triage model using vital signs, demographics, medical history, and chief complaints, proving more effective than traditional models in predicting ICU admission. Interpretable ML aids clinical decisions during triage.

  • Research Article
  • 10.2196/74196
Explainable AI for Predicting Mortality Risk in Metastatic Cancer: Retrospective Cohort Study Using the Memorial Sloan Kettering-Metastatic Dataset.
  • Jan 13, 2026
  • JMIR cancer
  • Polycarp Nalela + 2 more

Metastatic cancer remains one of the leading causes of cancer-related mortality worldwide. Yet, the prediction of survivability in this population remains limited by heterogeneous clinical presentations and high-dimensional molecular features. Advances in machine learning (ML) provide an opportunity to integrate diverse patient- and tumor-level factors into explainable predictive ML models. Leveraging large real-world datasets and modern ML techniques can enable improved risk stratification and precision oncology. This study aimed to develop and interpret ML models for predicting overall survival in patients with metastatic cancer using the Memorial Sloan Kettering-Metastatic (MSK-MET) dataset and to identify key prognostic biomarkers through explainable artificial intelligence techniques. We performed a retrospective analysis of the MSK-MET cohort, comprising 25,775 patients across 27 tumor types. After data cleaning and balancing, 20,338 patients were included. Overall survival was defined as deceased versus living at last follow-up. Five classifiers (extreme gradient boosting [XGBoost], logistic regression, random forest, decision tree, and naive Bayes) were trained using an 80/20 stratified split and optimized via grid search with 5-fold cross-validation. Model performance was assessed using accuracy, area under the curve (AUC), precision, recall, and F1-score. Model explainability was achieved using Shapley additive explanations (SHAP). Survival analyses included Kaplan-Meier estimates, Cox proportional hazards models, and an XGBoost-Cox model for time-to-event prediction. The positive predictive value and negative predictive value were calculated at the Youden index-optimal threshold. XGBoost achieved the highest performance (accuracy=0.74; AUC=0.82), outperforming other classifiers. In survival analyses, the XGBoost-Cox model with a concordance index (C-index) of 0.70 exceeded the traditional Cox model (C-index=0.66). 
SHAP analysis and Cox models consistently identified metastatic site count, tumor mutational burden, fraction of genome altered, and the presence of distant liver and bone metastases as among the strongest prognostic factors, a pattern that held at both the pan-cancer level and recurrently across cancer-specific models. At the cancer-specific level, performance varied; prostate cancer achieved the highest predictive accuracy (AUC=0.88), while pancreatic cancer was notably more challenging (AUC=0.68). Kaplan-Meier analyses demonstrated marked survival separation between patients with and without metastases (80-month survival: approximately 0.80 vs 0.30). At the Youden-optimal threshold, positive predictive value and negative predictive value were approximately 70% and 80%, respectively, supporting clinical use for risk stratification. Explainable ML models, particularly XGBoost combined with SHAP, can strongly predict survivability in metastatic cancers while highlighting clinically meaningful features. These findings support the use of ML-based tools for patient counseling, treatment planning, and integration into precision oncology workflows. Future work should include external validation on independent cohorts, integration with electronic health records via Fast Healthcare Interoperability Resources-based dashboards, and prospective clinician-in-the-loop evaluation to assess real-world use.
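
The abstract reports positive and negative predictive values at the Youden index-optimal threshold. As a minimal sketch of that step, the code below scans candidate thresholds, picks the one maximizing J = sensitivity + specificity - 1, and then computes PPV and NPV there. The helper names and the toy data are hypothetical; the study's own implementation is not described in the abstract.

```python
def confusion_counts(labels, scores, threshold):
    """Return (tp, fp, tn, fn), predicting positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def youden_threshold(labels, scores):
    """Scan observed scores; return (threshold, J) maximizing J = sens + spec - 1."""
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(labels, scores, t)
        sensitivity = tp / (tp + fn)  # assumes at least one positive label
        specificity = tn / (tn + fp)  # assumes at least one negative label
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical toy data, not taken from the MSK-MET cohort.
toy_labels = [0, 0, 0, 1, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
best_t, best_j = youden_threshold(toy_labels, toy_scores)
tp, fp, tn, fn = confusion_counts(toy_labels, toy_scores, best_t)
ppv = tp / (tp + fp)  # positive predictive value at the chosen threshold
npv = tn / (tn + fn)  # negative predictive value at the chosen threshold
print(best_t, best_j, ppv, npv)
```

Unlike sensitivity and specificity, PPV and NPV depend on outcome prevalence, so values obtained on a balanced sample may not carry over to a population with a different event rate.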

  • Research Article
  • Citations: 6
  • 10.2196/52837
Personalized Prediction of Long-Term Renal Function Prognosis Following Nephrectomy Using Interpretable Machine Learning Algorithms: Case-Control Study
  • Sep 20, 2024
  • JMIR Medical Informatics
  • Lingyu Xu + 12 more

Background: Acute kidney injury (AKI) is a common adverse outcome following nephrectomy. The progression from AKI to acute kidney disease (AKD) and subsequently to chronic kidney disease (CKD) remains a concern; yet, the predictive mechanisms for these transitions are not fully understood. Interpretable machine learning (ML) models offer insights into how clinical features influence long-term renal function outcomes after nephrectomy, providing a more precise framework for identifying patients at risk and supporting improved clinical decision-making processes. Objective: This study aimed to (1) evaluate postnephrectomy rates of AKI, AKD, and CKD, analyzing long-term renal outcomes along different trajectories; (2) interpret AKD and CKD models using Shapley Additive Explanations values and Local Interpretable Model-Agnostic Explanations algorithm; and (3) develop a web-based tool for estimating AKD or CKD risk after nephrectomy. Methods: We conducted a retrospective cohort study involving patients who underwent nephrectomy between July 2012 and June 2019. Patient data were randomly split into training, validation, and test sets, maintaining a ratio of 76.5:8.5:15. Eight ML algorithms were used to construct predictive models for postoperative AKD and CKD. The performance of the best-performing models was assessed using various metrics. We used various Shapley Additive Explanations plots and Local Interpretable Model-Agnostic Explanations bar plots to interpret the model and generated directed acyclic graphs to explore the potential causal relationships between features. Additionally, we developed a web-based prediction tool using the top 10 features for AKD prediction and the top 5 features for CKD prediction. Results: The study cohort comprised 1559 patients. Incidence rates for AKI, AKD, and CKD were 21.7% (n=330), 15.3% (n=238), and 10.6% (n=165), respectively. 
Among the evaluated ML models, the Light Gradient-Boosting Machine (LightGBM) model demonstrated superior performance, with an area under the receiver operating characteristic curve of 0.97 for AKD prediction and 0.96 for CKD prediction. Performance metrics and plots highlighted the model’s competence in discrimination, calibration, and clinical applicability. Operative duration, hemoglobin, blood loss, urine protein, and hematocrit were identified as the top 5 features associated with predicted AKD. Baseline estimated glomerular filtration rate, pathology, trajectories of renal function, age, and total bilirubin were the top 5 features associated with predicted CKD. Additionally, we developed a web application using the LightGBM model to estimate AKD and CKD risks. Conclusions: An interpretable ML model effectively elucidated its decision-making process in identifying patients at risk of AKD and CKD following nephrectomy by enumerating critical features. The web-based calculator, built on the LightGBM model, can assist in formulating more personalized and evidence-based clinical strategies.
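
The 76.5:8.5:15 ratio stated in the Methods is arithmetically consistent with a two-stage split: a 15% test hold-out first, then a 90:10 train/validation split of the remaining 85%. The abstract does not say this is how the split was produced, so the sketch below is only an assumption that reproduces the ratio; the function name `nested_split_fractions` is hypothetical.

```python
from fractions import Fraction


def nested_split_fractions(holdout, val_within_dev):
    """Overall (train, validation, test) fractions for a two-stage split.

    holdout: fraction held out as the test set in stage 1.
    val_within_dev: fraction of the remaining data used for validation in stage 2.
    """
    dev = 1 - holdout
    return dev * (1 - val_within_dev), dev * val_within_dev, holdout


# Assumed two-stage scheme: 15% test, then 10% of the remainder for validation.
train, val, test = nested_split_fractions(Fraction(15, 100), Fraction(10, 100))
percentages = [float(f * 100) for f in (train, val, test)]
print(percentages)  # [76.5, 8.5, 15.0]
```

Exact fractions are used so that the result matches the stated ratio without floating-point rounding.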

  • Preprint Article
  • 10.2196/preprints.52837
Personalized Prediction of Long-Term Renal Function Prognosis Following Nephrectomy Using Interpretable Machine Learning Algorithms: Case-Control Study (Preprint)
  • Sep 18, 2023
  • Lingyu Xu + 12 more

  • Research Article
  • Citations: 18
  • 10.1108/ijhma-11-2022-0172
Predictability of Belgian residential real estate rents using tree-based ML models and IML techniques
  • Apr 13, 2023
  • International Journal of Housing Markets and Analysis
  • Ian Lenaers + 2 more

Purpose: The purpose is twofold. First, this study aims to establish that black box tree-based machine learning (ML) models have better predictive performance than a standard linear regression (LR) hedonic model for rent prediction. Second, it shows the added value of analyzing tree-based ML models with interpretable machine learning (IML) techniques. Design/methodology/approach: Data on Belgian residential rental properties were collected. Two tree-based ML models, random forest regression and eXtreme gradient boosting regression, were applied to derive rent prediction models, and their predictive performance was compared with that of an LR model. Interpretations of the tree-based models regarding important factors in predicting rent were made using SHapley Additive exPlanations (SHAP) feature importance (FI) plots and SHAP summary plots. Findings: Results indicate that tree-based models perform better than an LR model for Belgian residential rent prediction. The SHAP FI plots agree that asking price, cadastral income, livable surface, number of bedrooms, number of bathrooms and variables measuring the proximity to points of interest are dominant predictors. The direction of relationships between rent and its factors is determined with SHAP summary plots. In addition to linear relationships, it emerges that nonlinear relationships exist. Originality/value: Rent prediction using ML is less studied than house price prediction. In addition, studying prediction models using IML techniques is relatively new in real estate economics. Moreover, to the best of the authors’ knowledge, this study is the first to derive insights into the driving determinants of predicted rents from SHAP FI and SHAP summary plots.

  • Research Article
  • Citations: 5
  • 10.3390/medicina61010016
The Potential of SHAP and Machine Learning for Personalized Explanations of Influencing Factors in Myopic Treatment for Children.
  • Dec 26, 2024
  • Medicina (Kaunas, Lithuania)
  • Jun-Wei Chen + 4 more

Background and Objectives: The rising prevalence of myopia is a significant global health concern. Atropine eye drops are commonly used to slow myopia progression in children, but their long-term use raises concern about intraocular pressure (IOP). This study uses SHapley Additive exPlanations (SHAP) to improve the interpretability of machine learning (ML) models predicting end IOP, offering clinicians explainable insights for personalized patient management. Materials and Methods: This retrospective study analyzed data from 1191 individual eyes of 639 boys and 552 girls with myopia treated with atropine. The average age of the whole group was 10.6 ± 2.5 years. The spherical equivalent (SE) refractive error, expressed as degree of myopia, had a base SE of 2.63 D and an end SE of 3.12 D. Data were collected from clinical records, including demographic information, IOP measurements, and atropine treatment details. The patients were divided into two subgroups based on a baseline IOP of 14 mmHg. ML models, including Lasso, CART, XGB, and RF, were developed to predict the end IOP value. Then, the best-performing model was further interpreted using SHAP values. The SHAP module created a personalized and dynamic graphic to illustrate how various factors (e.g., age, sex, cumulative duration, and dosage of atropine treatment) affect the end IOP. Results: RF showed the best performance, with superior error metrics in both subgroups. The interpretation of RF with SHAP revealed that age and the recruitment duration of atropine consistently influenced IOP across subgroups, while other variables had varying effects. SHAP values also offer insights, helping clinicians understand how different factors contribute to the predicted IOP value in individual children. Conclusions: SHAP provides an alternative approach to understanding the factors affecting IOP in children with myopia treated with atropine. 
Its enhanced interpretability helps clinicians make informed decisions, improving the safety and efficacy of myopia management. This study demonstrates the potential of combining SHAP with ML models for personalized care in ophthalmology.

  • Research Article
  • Citations: 38
  • 10.1038/s41598-023-46930-2
Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel
  • Nov 10, 2023
  • Scientific Reports
  • Andrea Mastropietro + 2 more

Machine learning (ML) algorithms are extensively used in pharmaceutical research. Most ML models have black-box character, thus preventing the interpretation of predictions. However, rationalizing model decisions is of critical importance if predictions should aid in experimental design. Accordingly, in interdisciplinary research, there is growing interest in explaining ML models. Methods devised for this purpose are a part of the explainable artificial intelligence (XAI) spectrum of approaches. In XAI, the Shapley value concept originating from cooperative game theory has become popular for identifying features determining predictions. The Shapley value concept has been adapted as a model-agnostic approach for explaining predictions. Since the computational time required for Shapley value calculations scales exponentially with the number of features used, local approximations such as Shapley additive explanations (SHAP) are usually required in ML. The support vector machine (SVM) algorithm is one of the most popular ML methods in pharmaceutical research and beyond. SVM models are often explained using SHAP. However, there is only limited correlation between SHAP and exact Shapley values, as previously demonstrated for SVM calculations using the Tanimoto kernel, which limits SVM model explanation. Since the Tanimoto kernel is a special kernel function mostly applied for assessing chemical similarity, we have developed the Shapley value-expressed radial basis function (SVERAD), a computationally efficient approach for the calculation of exact Shapley values for SVM models based upon radial basis function kernels that are widely applied in different areas. SVERAD is shown to produce meaningful explanations of SVM predictions.
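
The abstract notes that exact Shapley value computation scales exponentially with the number of features. A minimal brute-force sketch makes that cost visible: every coalition of the remaining features is enumerated for each feature. This is the textbook definition, not the SVERAD method itself, and the toy value function below is a hypothetical stand-in for a model.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value, n_features):
    """Exact Shapley values by enumerating all coalitions (exponential in n_features).

    value: function mapping a frozenset of feature indices to a number.
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(coalition):
    """Hypothetical payoff: two additive effects plus one interaction (features 0 and 2)."""
    total = 0.0
    if 0 in coalition:
        total += 2.0
    if 1 in coalition:
        total += 1.0
    if 0 in coalition and 2 in coalition:
        total += 3.0
    return total


phi = exact_shapley(toy_value, 3)
print(phi)
```

In this toy case the interaction payoff is shared equally between features 0 and 2, and the values sum to the payoff of the full coalition, which is the efficiency property of Shapley values. Each feature requires 2^(n-1) coalitions, so the loop becomes impractical beyond a few dozen features.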

  • Research Article
  • 10.1111/joor.70108
An Interpretable Machine Learning Model Based on MRI Features for Predicting Pain Severity in Temporomandibular Disorders.
  • Nov 18, 2025
  • Journal of oral rehabilitation
  • Chuanfang Xu + 6 more

Chronic pain around the temporomandibular joint (TMJ) and masticatory muscles is a primary symptom of temporomandibular disorders (TMD). However, the clinical significance of magnetic resonance imaging (MRI) features in predicting TMD-related pain remains unclear. This study aimed to develop and interpret machine learning (ML) models based on MRI characteristics for predicting pain severity in patients with TMD. The present retrospective study included 584 patients with TMD between January 2022 and December 2024, yielding a total of 755 TMJ MRI data sets. Pain severity was classified using the visual analogue scale (VAS). Demographic variables (age, sex) and MRI features (including lesion side, disc position, disc morphology, disc signal, disc perforation, bilaminar zone tear, joint space, joint effusion, condylar movement, bony changes and morphology/signal of the lateral pterygoid muscle) were collected. Eleven ML models based on demographic and MRI features were developed: logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), adaptive boosting (AdaBoost), gradient boosting classifier (GBC), bagging classifier (BC), extremely randomised trees (ETC), decision tree classifier (DTC) and multilayer perceptron (MLP). Model performance was evaluated using multiple metrics, including the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity and F1 score. Precision-recall (PR) curves and calibration curves were plotted to assess discrimination and model calibration. Decision curve analysis (DCA) was conducted to evaluate the clinical net benefit across a range of threshold probabilities. Model interpretability was enhanced using Shapley Additive Explanations (SHAP), which quantified the contribution of each feature to individual predictions. 
Feature selection was conducted based on mean SHAP values, and separate LightGBM models were constructed using the Top 3, 5, and 9 most important features, as well as the full-feature set, for performance comparison. The data set was randomly divided into a training set (n = 604) and a test set (n = 151). Among the 11 ML models, the LightGBM model demonstrated the best predictive performance, with an AUC of 0.899, and was therefore identified as the optimal model. SHAP analysis identified age, disc position and condylar movement as the top three contributing features. Feature selection analysis indicated that selecting the top nine SHAP-ranked variables led to the highest diagnostic performance, with an AUC of 0.829. This study developed an interpretable, high-performing MRI-based ML model incorporating SHAP analysis to integrate imaging and clinical features for objective pain assessment, which may help identify high-risk TMD patients and guide personalised treatment strategies.

  • Research Article
  • Citations: 75
  • 10.1016/j.aap.2022.106617
On the interpretability of machine learning methods in crash frequency modeling and crash modification factor development
  • Feb 21, 2022
  • Accident Analysis & Prevention
  • Xiao Wen + 4 more

  • Research Article
  • 10.1111/nicc.70469
Machine Learning Interpretability to Assess the Association Between Time in Tight Range and Mortality in Cardiogenic Shock.
  • May 1, 2026
  • Nursing in critical care
  • Yang Jiang + 6 more

Cardiogenic shock (CS) is a critical condition of end-organ hypoperfusion with high mortality. Fluctuations in blood glucose (BG) levels may exacerbate cardiovascular instability in critically ill patients. Time In Tight Range (TITR), defined as the percentage of time in the target BG range of 3.9-7.8 mmol/L (70-140 mg/dL), has become an increasingly important index of glycaemic status, but its impact on mortality in CS remains unclear. Interpretable machine learning (ML) models provide transparent, quantitative and visual insights into the prognostic importance of TITR, clarifying its pivotal role in outcome prediction and providing objective evidence to support individualised glucose management. This study aimed to investigate the relationship between TITR and mortality in patients with CS, and to provide strong evidence for early intervention and personalised blood glucose management. We conducted a retrospective multi-cohort study to examine the association between TITR and mortality in patients with CS. The relationship between TITR and in-hospital mortality was analysed using a restricted cubic spline (RCS) model, log-rank test, multivariable Cox and logistic regression analyses. ML models, including XGBoost, LightGBM, CatBoost, Gradient Boosting, Support Vector Machine (SVM), Neural Network and Naive Bayes, were developed to predict mortality and compared with traditional clinical scoring systems. Model interpretability was assessed using SHapley Additive exPlanations (SHAP). Sensitivity and subgroup analyses were used to reveal the robustness of the results. RCS analysis revealed an inverse (L-shaped) association (p < 0.001) between TITR and in-hospital mortality in both the Medical Information Mart for Intensive Care IV (MIMIC-IV) and the eICU Collaborative Research Database (eICU) cohorts. 
Kaplan-Meier survival analyses revealed that patients with TITR > 57% (High TITR group) had significantly lower in-hospital mortality than those with TITR ≤ 57% (Low TITR group) in both cohorts. The hazard ratios (HRs) (95% confidence interval [CI]) estimated by log-rank test were 1.72 (1.49, 1.99) and 1.49 (1.19, 1.87) in the MIMIC-IV and eICU cohorts, respectively (both p < 0.001). Sensitivity analyses yielded consistent results, confirming the robustness of the findings. In addition, analyses of ICU mortality and 28-day mortality (the latter available only in the MIMIC-IV cohort) also demonstrated a pattern consistent with the primary outcome. Based on area under the curve (AUC) values, ML models, including CatBoost (AUC = 0.76; 95% CI: 0.73-0.80 in MIMIC-IV; 0.77; 95% CI: 0.72-0.83 in eICU), Gradient Boosting (AUC = 0.75; 95% CI: 0.72-0.79 and 0.74; 95% CI: 0.68-0.80) and XGBoost (AUC = 0.74; 95% CI: 0.70-0.77 and 0.76; 95% CI: 0.71-0.82), outperformed traditional scoring systems. Interpretability analysis via SHAP consistently highlighted TITR as the most influential factor in mortality prediction. These findings underscore the critical role of TITR in outcome prediction and demonstrate both the superiority and interpretability of ML models for risk stratification and decision support. Achieving higher TITR is associated with improved outcomes, highlighting the importance of dynamic glucose control in this population. The findings offer evidence-based guidance for ICU nursing interventions, highlighting TITR as a key modifiable factor for improving outcomes for patients with CS. The strong performance of ML models supports their clinical application for more accurate risk stratification, while identification of TITR as the primary mortality predictor, reinforced by transparent SHAP explanations, provides actionable insights to guide early, targeted interventions.
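
The abstract defines TITR as the percentage of time with blood glucose in the 3.9-7.8 mmol/L range and groups patients at a 57% cut-off. A minimal sketch of that calculation follows. It assumes equally spaced glucose readings (so the share of readings approximates the share of time) and inclusive range boundaries; neither detail is stated in the abstract, and the function names and sample readings are hypothetical.

```python
def titr_percent(glucose_mmol_l, low=3.9, high=7.8):
    """Percentage of readings within [low, high] mmol/L.

    Assumes readings are equally spaced in time and that the range
    boundaries are inclusive (both are assumptions of this sketch).
    """
    if not glucose_mmol_l:
        raise ValueError("at least one glucose reading is required")
    in_range = sum(1 for g in glucose_mmol_l if low <= g <= high)
    return 100.0 * in_range / len(glucose_mmol_l)


def titr_group(titr, cutoff=57.0):
    """Group label using the 57% cut-off reported in the abstract."""
    return "High TITR" if titr > cutoff else "Low TITR"


# Hypothetical readings in mmol/L, not patient data.
readings = [5.2, 6.8, 8.4, 7.1, 3.5, 6.0, 7.8, 9.9, 5.5, 6.3]
titr = titr_percent(readings)
group = titr_group(titr)
print(f"TITR = {titr:.1f}% ({group})")
```

With irregularly timed measurements, a time-weighted version (weighting each reading by the interval it covers) would be closer to the "percentage of time" definition.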

  • Research Article
  • 10.1016/j.acra.2025.04.068
Right Ventricular Strain as a Key Feature in Interpretable Machine Learning for Identification of Takotsubo Syndrome: A Multicenter CMR-based Study.
  • May 1, 2025
  • Academic radiology
  • Zeliu Du + 17 more

  • Research Article
  • 10.1149/ma2023-02653153mtgabs
Nuclear Magnetic Resonance Chemical Shift As Highly Explainable Chemical Structure Fingerprints for Anion Exchange Membrane Polymers
  • Dec 22, 2023
  • Electrochemical Society Meeting Abstracts
  • Yin Kan Phua + 2 more

The recent shift towards clean energy has significantly increased the demand for both fuel cells (used as clean power generators) and water electrolyzers (used for hydrogen supply). The anion exchange membrane (AEM) serves as a core component of these devices, though its low anion conductivity and durability limit their potential for commercialization. Much research and development (R&D) has sought improvements in AEMs [1], but the current empirical approach consumes significant resources, such as cost, labor, and time. To reduce resource consumption, it is important to implement materials informatics (MI), which allows high-speed screening of materials through a pre-trained AEM polymer machine learning (ML) model. However, AEMs are made up of polymers whose chemical structures are complex and hard to represent in a machine-understandable form. Fingerprints, generated algorithmically, are often used to represent chemical structures in numerical form [2]. The majority of these fingerprints are designed for small molecules, not polymers, and they are usually unintuitive and difficult to understand due to their topological nature. In contrast, nuclear magnetic resonance (NMR) chemical shifts have long been used in chemistry to identify the chemical structure of a sample [3]. In this study, we aim to use the high-resolution nature of NMR chemical shifts in identifying structural formulas as a chemical structure fingerprint for ML models, so that a polymer-suited and highly explainable fingerprint can be developed. First, an AEM database containing structural and experimental condition information was built using data extracted from 62 papers. The experimental conditions included were the anion conductivity measuring temperature and the alkaline stability test conditions (test temperature, length of test, and concentration of alkaline solution). Then, the 13C NMR chemical shifts for the chemical structures contained in the structural information were calculated using ChemDraw.
The obtained chemical shifts were converted to numerical strings, named the “NMR fingerprint”. A new AEM database containing both structural information (molar ratio of each building block and NMR fingerprints) and experimental conditions was used as the training database for ML models. The target variable was set to anion conductivity, and the rest were explanatory variables. The ML model used was XGBoost. Cross-validation was used to evaluate the capability of the ML models to predict the anion conductivity of novel AEM polymers. The prediction logic was analyzed using Shapley additive explanations (SHAP) values. The database built contains data from 62 AEM papers, with 2,197 anion conductivity data points. Each AEM chemical structure present in the database was converted to NMR fingerprints using NMR chemical shifts, obtaining around 2,000 NMR fingerprints for each AEM polymer unit. Together with the experimental conditions and structural information included in the database, the data were used as the train-validation dataset for XGBoost. The coefficient of determination (R²) obtained for the cross-validated model was 0.9235, implying that the model learned the relationship between anion conductivity and AEM polymer structure with high accuracy, with the aid of experimental conditions. Then, the prediction logic of the ML model was explored using SHAP values, which are computed from coalitional game theory and are used to increase the transparency and interpretability of ML models. Analyzing the plot of SHAP values for the top 20 important variables used in XGBoost showed that the measuring temperature for anion conductivity ranked highest, which is consistent with the well-known behavior of AEM polymers. In addition, non-experimental-condition variables such as 29.8_A ranked among the top 3 important variables.
29.8_A is the chemical shift for alkyl groups located between two imidazolium groups of an AEM polymer, suggesting that the presence or absence of more than one imidazolium group per side chain is important in determining the anion conductivity of an AEM polymer. SHAP values for 29.8_A show that a higher feature value (pink color) gives a higher impact (positive region of the x-axis) on the target variable, implying that having alkyl groups between imidazolium groups has a beneficial effect on anion conductivity. This ability to explain the prediction logic of the ML model shows that using NMR chemical shifts as fingerprints for AEM polymer structures provides an intuitive, human-understandable explanation of ML prediction logic. Together with the high cross-validation accuracy, NMR chemical shifts have the potential not only to be a gold standard for expressing polymer structures in machine-understandable form, but also to strongly push the adoption of ML in the AEM polymer field, creating a paradigm shift for AEM R&D.

References
[1] S. Gottesfeld et al., J. Power Sources 2018, 375, 170-184.
[2] J. Bajorath, J. Chem. Inf. Comput. Sci. 2001, 41, 2, 233–245.
[3] P. Jezzard et al., Adv. Mater. 1992, 4, 2, 82-90.

Figure 1
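The abstract describes SHAP values as coming from coalitional game theory. A minimal sketch of that idea follows: exact Shapley values computed by enumerating every feature coalition for a tiny hypothetical model. The model, the feature values, and the all-zero baseline are assumptions for illustration only; the SHAP library applies far more efficient algorithms to tree models such as XGBoost.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values for features 0..n_features-1.

    value_fn(coalition) returns the model output when only the features
    in `coalition` (a frozenset of indices) are "present".
    """
    phi = [0.0] * n_features
    players = list(range(n_features))
    for i in players:
        others = [p for p in players if p != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain feature i.
            weight = (factorial(k) * factorial(n_features - k - 1)
                      / factorial(n_features))
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Hypothetical toy model with one interaction term (not the paper's model).
def toy_model(x):
    return 2 * x[0] + 3 * x[1] - x[2] + x[0] * x[1]

instance = (1.0, 2.0, 3.0)   # hypothetical feature values to explain
baseline = (0.0, 0.0, 0.0)   # assumed reference point for "absent" features

def value_fn(coalition):
    # Features in the coalition take the instance value, others the baseline.
    x = [instance[j] if j in coalition else baseline[j] for j in range(3)]
    return toy_model(x)

phi = shapley_values(value_fn, 3)
print(phi)  # approximately [3.0, 7.0, -3.0]
# Efficiency property: contributions sum to f(instance) - f(baseline) = 7.
print(sum(phi), toy_model(instance) - toy_model(baseline))
```

The interaction term `x[0] * x[1]` is split equally between the two features involved. A positive value for a feature corresponds to the "positive region of the x-axis" described for 29.8_A.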

  • Research Article
  • Cite Count Icon 3
  • 10.2106/jbjs.oa.24.00213
Development of Explainable Machine Learning Models to Identify Patients at Risk for 1-Year Mortality and New Distant Metastases Postendoprosthetic Reconstruction for Lower Extremity Bone Tumors: A Secondary Analysis of the PARITY Trial.
  • Apr 1, 2025
  • JB & JS open access
  • Jiawen Deng + 6 more

Accurate prediction of postoperative metastasis and mortality risks in patients undergoing lower-limb oncological resection and endoprosthetic reconstruction is essential for guiding adjuvant therapies and managing patient expectations. Current prediction methods are limited by variability in patient-specific factors. This study aims to develop and internally validate explainable machine learning (ML) models to predict the 1-year risk of new distant metastases and mortality in these patients. We performed a secondary analysis of data from the Prophylactic Antibiotic Regimens in Tumor Surgery trial, which included 604 patients. Candidate features were selected based on availability and clinical relevance and then narrowed using Least Absolute Shrinkage and Selection Operator (LASSO) regression and Boruta algorithms. Six ML classification algorithms were tuned and calibrated: logistic regression, support vector machines, random forest, Light Gradient Boosting Machine (LightGBM), eXtreme Gradient Boosting (XGBoost), and neural networks. Models were developed with and without including percent tumor necrosis due to its high missing data rate (>30%). Hyperparameters were tuned using Bayesian optimization. Internal validation was conducted using a 30% hold-out set. Model explainability was assessed using permutation-based feature importance and SHapley Additive exPlanations. LightGBM was identified as the best-performing algorithm for both outcomes. For 1-year mortality prediction without percent necrosis, LightGBM achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.78 (95% confidence interval [CI] 0.70-0.86) during cross-validation and 0.72 on internal validation. For distant metastasis prediction, the LightGBM model without percent necrosis achieved an AUC-ROC of 0.77 (95% CI 0.71-0.84) during cross-validation and 0.77 on internal validation. Including percent necrosis did not significantly improve model performance. 
The top predictors identified were patient age, largest tumor dimension, and tumor stage. Explainable ML models can effectively predict the 1-year risk of mortality and new distant metastases in patients undergoing lower-limb oncological resection and endoprosthetic reconstruction. Further external validation and consideration of other data modalities are required before integrating these ML-driven risk assessments into routine clinical practice. Level II, Prognostic Study. See Instructions for Authors for a complete description of levels of evidence.
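Permutation-based feature importance, one of the two explainability methods named in this abstract, can be sketched as follows. The rule-based model, the feature columns, and the data are all hypothetical; the study itself used tuned classifiers such as LightGBM and reports AUC-ROC, whereas this sketch uses accuracy for brevity.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(1 for row, label in zip(X, y) if model(row) == label) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical data: columns are [age, largest tumor dimension (cm), noise].
X = [
    [45, 5.0, 0.3],
    [62, 12.0, 0.9],
    [38, 7.5, 0.1],
    [70, 10.5, 0.5],
    [55, 6.0, 0.7],
    [66, 9.0, 0.2],
]

def toy_model(row):
    # Hypothetical rule that looks only at the tumor dimension column.
    return 1 if row[1] > 8.0 else 0

y = [toy_model(row) for row in X]  # labels chosen so baseline accuracy is 1.0

importances = permutation_importance(toy_model, X, y)
print(importances)  # columns the model ignores get exactly 0.0
```

Shuffling a column the model never reads cannot change its predictions, so that column's importance is zero; only the tumor dimension column shows a drop here.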
