Articles published on SHapley Additive exPlanations

7845 Search results
  • New
  • Research Article
  • 10.1038/s41598-025-33980-x
A machine learning-based prediction model for poor prognosis in sepsis using lymphocyte count: a national, multicenter prospective cohort
  • Jan 22, 2026
  • Scientific Reports
  • Siang Huang + 6 more

Sepsis-induced immunosuppression leads to poor prognosis. Circulating lymphocyte count (LC), an easily accessible clinical marker, closely reflects immune status in sepsis. The study aims to perform immune phenotyping of sepsis patients using dynamic LC for early identification of high-risk individuals. A latent class trajectory model (LCTM) was used to analyze the dynamic trajectories of LC based on repeated measurements: at least two within the first 24 h after sepsis diagnosis, followed by two more between day 2 and day 7. Survival differences among subphenotypes were assessed using Kaplan–Meier curves and Cox regression. Feature selection was conducted via the Boruta algorithm, and a high-precision machine learning model was developed to predict the target trajectory. Model interpretability was ensured through SHapley Additive exPlanations (SHAP). The predictive performance of the model for ICU mortality was assessed using the receiver operating characteristic (ROC) curve. The derivation cohort included 2085 sepsis patients from the China Multicenter Sepsis database, and the external validation cohort included 1299 sepsis patients. We identified four trajectory patterns of LC dynamics, among which the persistent lymphopenia (PL) subgroup exhibited the highest disease severity and poorest prognosis. The trajectory model demonstrated consistent patterns in external validation. Six machine learning models were utilized to determine the best model to identify the PL subphenotype, and an online prediction tool was developed for clinical application. Incorporating the PL trajectory subphenotype significantly improved the predictive performance for ICU mortality. Dynamic LC trajectories effectively capture immunological heterogeneity in sepsis, encompassing immunocompromised and immunocompetent hosts. 
These findings underscore the importance of early identification of patients with persistent lymphopenia to better target populations for future sepsis immunotherapy.

  • New
  • Research Article
  • 10.1113/ep093356
Cortical dynamics of cold exposure and thermal recovery: Evidence from EEG-based spatiotemporal analysis.
  • Jan 22, 2026
  • Experimental physiology
  • Qing Zhang + 7 more

Human thermal perception involves complex and dynamic interactions between peripheral input and central neural regulation. However, the spatial and temporal characteristics of brain responses to different cold exposure scenarios remain poorly understood. In this study, we combined traditional analysis with AI-based anomaly detection to examine electroencephalographic (EEG) responses across five stages of cold exposure in 20 healthy participants, including baseline, cold exposure, wind stimulation, adaptation and recovery. Alpha-band power analysis revealed 14 EEG channels with significant stage-dependent differences, primarily located in the right hemisphere across frontal, central and parietal regions. Shapley additive explanations (SHAP)-based feature importance scores further validated stage-specific channels, identifying F8, T8 and CP6 for cold exposure, T7 for wind stimulation, T8 for adaptation, and F8 and CP6 for recovery. Time-frequency analysis revealed early spectral responses within 1s for cold exposure and recovery, and within 2s for wind stimulation, while AI anomaly detection estimated later latencies of 2.201∼2.735s, highlighting the distinct sensitivities of each method. These results reveal right-lateralized, stage-specific brain activations, and demonstrate the complementary value of traditional and AI methods in decoding thermal responses.

  • New
  • Research Article
  • 10.1088/1361-6560/ae3658
Radiological and biological dictionary of radiomics features: addressing understandable AI issues in personalized breast cancer; dictionary version BM1.0
  • Jan 22, 2026
  • Physics in Medicine & Biology
  • Arman Gorji + 6 more

Objective: Radiomics-based artificial intelligence (AI) models show potential in breast cancer diagnosis but lack interpretability. This study bridges the gap between radiomic features (RFs) and Breast Imaging Reporting and Data System (BI-RADS) descriptors through a clinically interpretable framework. Methods: We developed a dual-dictionary approach. First, a clinical mapping dictionary (CMD) was constructed by mapping 56 RFs to BI-RADS descriptors (shape, margin, internal enhancement (IE)) based on literature and expert review. Second, we applied this framework to a classification task to predict triple-negative (TNBC) versus non-TNBC subtypes using dynamic contrast-enhanced MRI data from a multi-institutional cohort of 1549 patients. We trained 27 machine learning classifiers with 27 feature selection methods. Using SHapley Additive exPlanations (SHAP), we interpreted the model's predictions and developed a Statistical Mapping Dictionary for 51 RFs not included in the CMD. Results: The best-performing model (variance inflation factor feature selector + extra trees classifier) achieved an average cross-validation accuracy of 0.83 ± 0.02. Our dual-dictionary approach successfully translated predictive RFs into understandable clinical concepts. For example, higher values of 'Sphericity', corresponding to a round/oval shape, were predictive of TNBC. Similarly, lower values of 'Busyness', indicating more homogeneous IE, were also associated with TNBC, aligning with existing clinical observations. This framework confirmed known imaging biomarkers and identified novel, data-driven quantitative features. Conclusion: This study introduces a novel dual-dictionary framework (BM1.0) that bridges RFs and the BI-RADS clinical lexicon. By enhancing the interpretability and transparency of AI models, the framework supports greater clinical trust and paves the way for integrating RFs into breast cancer diagnosis and personalized care.

  • New
  • Research Article
  • 10.3389/fneur.2026.1734264
An interpretable machine learning model for predicting sepsis risk in ICU patients with non-traumatic subarachnoid hemorrhage: development and validation using the MIMIC-IV database
  • Jan 21, 2026
  • Frontiers in Neurology
  • Shaojie Guo + 9 more

Objective: This study aimed to develop and validate a machine learning (ML) prediction model for assessing the risk of sepsis in intensive care unit (ICU) patients with non-traumatic subarachnoid hemorrhage (SAH), thereby providing a reference for the early clinical identification of high-risk patients. Methods: We conducted a retrospective cohort study using data from the Medical Information Mart for Intensive Care (MIMIC-IV) database, which includes admissions between 2008 and 2022. We extracted demographic information, laboratory parameters, complications, and other clinical data. Patients were randomly divided into a training set and a test set in an 8:2 ratio. Least Absolute Shrinkage and Selection Operator regression was used to identify core predictive features. Fourteen machine learning models were constructed, including Random Forest, Gradient Boosting, Kernel-based SVM, Logistic Regression, K-Nearest Neighbors, Partial Least Squares, Boosting Method, Neural Network, Naive Bayes, Discriminant Analysis, Lasso, XGBoost, CATBoost, and LightGBM. Key evaluation metrics included sensitivity, specificity, accuracy, F1 score, Youden index, and the area under the curve (AUC). SHapley Additive exPlanations (SHAP) analysis was employed to interpret the model’s decision logic, and Decision Curve Analysis (DCA) was used to assess clinical utility. Results: A total of 1,052 patients with non-traumatic SAH were enrolled, with 841 assigned to the training set and 211 to the test set. Lasso regression identified 11 core predictive features, including pneumonia, norepinephrine use, mechanical ventilation, Glasgow Coma Scale (GCS) grade, and acute kidney injury (AKI). The CATBoost model demonstrated the best performance: in the training set, it achieved an AUC of 0.889, sensitivity of 73.2%, specificity of 85.9%, and a Youden index of 0.592; in the test set, it achieved an AUC of 0.887, sensitivity of 75.5%, specificity of 82.3%, and a Youden index of 0.578. 
Performance fluctuation between the training and test sets was less than 2%, indicating excellent stability. SHAP analysis revealed that pneumonia, norepinephrine use, and mechanical ventilation were the top three features influencing sepsis risk, with pneumonia significantly increasing the risk. DCA results showed that the CATBoost model had the highest net benefit in the high-risk threshold range of 0.2–0.6. Conclusion: The machine learning model developed based on the MIMIC-IV database can effectively predict the risk of sepsis in ICU patients with non-traumatic SAH. It demonstrates good interpretability and clinical utility, providing a basis for clinical risk stratification and precise intervention.
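
The Youden index quoted in the abstract above is conventionally defined as J = sensitivity + specificity − 1. The following is a minimal Python sketch under that standard definition; the function name `youden_index` and the `reported` dictionary are illustrative and do not come from the paper. It simply recomputes J from the rounded sensitivity and specificity values given in the abstract.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract (rounded percentages).
reported = {
    "training": (0.732, 0.859),
    "test": (0.755, 0.823),
}

for split, (sens, spec) in reported.items():
    # Print J to three decimals for comparison with the reported values.
    print(f"{split}: J = {youden_index(sens, spec):.3f}")
```

With these rounded inputs the sketch gives 0.591 for the training set and 0.578 for the test set. The small gap to the reported training value of 0.592 is what one would expect if the authors computed J from unrounded sensitivity and specificity.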

  • New
  • Research Article
  • 10.1186/s40001-026-03846-7
Development and validation of an inflammatory index-based interpretable machine learning model for mortality risk stratification in hemodialysis patients.
  • Jan 21, 2026
  • European journal of medical research
  • Zhenhua Yang + 4 more

Hemodialysis patients with end-stage renal disease have high all-cause mortality, with chronic low-grade inflammation as a key prognostic factor. Existing mortality prediction models lack both accuracy and clinical interpretability, and no studies have systematically integrated multiple inflammatory indices (neutrophil-to-lymphocyte ratio, monocyte-to-lymphocyte ratio, etc.) into interpretable tools for this population. A single-center retrospective cohort study included 512 hemodialysis patients (Jan 2021 to Oct 2024) from The Central Hospital of Wuhan, split into training (70%), validation (15%), and test (15%) sets per TRIPOD guidelines. Fifteen baseline clinical variables and five inflammatory indices were collected. Missing data were imputed, data were normalized, and oversampling was used to address class imbalance. Twelve models (9 traditional machine learning, 1 neural network, 2 ensembles) were built, optimized via tenfold cross-validation, and interpreted with SHapley Additive exPlanations. At follow-up (Oct 30, 2024), 212 (41.4%) patients had died. Non-survivors differed significantly from survivors in myocardial infarction (16.0% vs. 2.7%, p < 0.001), neutrophil-to-lymphocyte ratio (median: 4.1 vs. 3.6, p = 0.012), dialysis vintage (42.5 vs. 77.5 months, p < 0.001), and age (62.2 ± 12.3 vs. 58.1 ± 13.3 years, p < 0.001). The Stacking model performed best (AUC = 0.983, accuracy = 0.922), outperforming logistic regression (AUC = 0.703). Body mass index, myocardial infarction history, and neutrophil-to-lymphocyte ratio were top predictors. The interpretable stacking model enables accurate mortality risk stratification for hemodialysis patients. Future multi-center validation and multi-modal data integration will enhance its generalizability for clinical application.

  • New
  • Research Article
  • 10.1038/s41598-026-36923-2
An online interpretable machine learning model for predicting cardiometabolic multimorbidity risk in patients with type 2 diabetes mellitus.
  • Jan 21, 2026
  • Scientific reports
  • Xiaohan Liu + 6 more

Cardiometabolic multimorbidity (CMM), a major complication in type 2 diabetes mellitus (T2DM), increases mortality and healthcare burden. Early identification of high-risk individuals is crucial for precision intervention. This study aimed to develop and validate an online interpretable machine learning system for forecasting the CMM risk in T2DM populations to facilitate personalized decision-making and early intervention. We used data from 793 T2DM patients from a tertiary hospital in Shanxi Province as the derivation cohort, divided into training (80%) and internal validation (20%) sets, with 360 cases from another independent center for external validation. Feature selection was performed through recursive feature elimination with random forest algorithm. We employed six machine learning algorithms to develop the CMM risk model. Model performance was evaluated using accuracy, precision, recall, F1-score, and area under the curve (AUC). The SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) provided model interpretability. After feature screening, nine predictors were included in the model. In internal validation, the Stacking model achieved the highest AUC (0.868), maintaining good external validation performance with an AUC of 0.822. The web-based system was accessible on https://t2dmcmmpredictionweb.streamlit.app/. This system assisted healthcare providers to identify high-risk populations early and facilitate timely intervention to mitigate disease progression.

  • New
  • Research Article
  • 10.1002/gj.70175
Interpretable Landslide Hazard Analysis in the Western Ghats of Maharashtra, India: A Hybrid Machine Learning and Statistical Approach
  • Jan 21, 2026
  • Geological Journal
  • Nilesh Suresh Pawar + 1 more

ABSTRACT Landslide hazard mapping is critical for disaster risk reduction and resilient planning in vulnerable regions. Whilst the existing methods for Landslide hazard mapping (LHM) provide good predictability, they still lack in analysing the impact of parameters on the hazard risk. Although Explainable AI (XAI) application enhances this capability, its use in LHM analysis is still limited. This study presents an interpretable framework for evaluating landslide hazards in Raigad district, Maharashtra, that integrates machine learning models, Random Forest (RF) and Support Vector Machine (SVM) with statistical techniques, Frequency Ratio (FR) and Shannon Entropy (SE). Sixteen conditioning factors, selected through multicollinearity screening and feature‐selection methods and spatially validated inventory of 174 datasets of landslides and non‐landslides each, were used for analysis. Model performance was assessed using evaluation metrics. The RF model achieved the highest AUC‐ROC of 0.90, followed by SVM (0.81), SE (0.80), and FR (0.79). To identify critical parameters affecting landslide vulnerability, XAI methods such as SHapley Additive exPlanations (SHAP) and partial dependence plot analysis were applied. Slope angle emerged as the most dominant predictor of landslide risk. The generated hazard maps can offer more useful insights into the categorisation of land based on the degree of landslide risk. These maps can be used in development planning to help with infrastructure design, land‐use zoning, and prioritisation of high‐risk locations for mitigation.

  • New
  • Research Article
  • 10.1002/hkj2.70060
Utilizing machine learning to establish a predictive model for transfusion requirements in patients with severe trauma: A comprehensive analysis
  • Jan 21, 2026
  • Hong Kong Journal of Emergency Medicine
  • Kaiyuan Li + 7 more

Background: Blood transfusion plays a crucial role in the emergency care of trauma patients, significantly impacting their survival rates and prognoses, thereby saving millions of lives annually. Early and rapid recognition of transfusion needs in trauma patients is essential. This study aims to establish a predictive model for emergency transfusions in patients with severe trauma using machine learning (ML). Methods: Data were obtained from a comprehensive and anonymized set of medical records. LASSO regression was employed for feature selection. Six ML algorithms were utilized to develop predictive models. The performance of these models was assessed based on their identification accuracy, calibration, and clinical utility. Additionally, the SHapley Additive exPlanations (SHAP) method was applied to visualize model features and predictions on an individual case basis. Results: A total of 1716 trauma patients were included in the study, and 278 (16.2%) received a blood transfusion after emergency room admission. A model with 11 variables was built, with XGBoost performing best, achieving an area under the curve of 0.884 (95% CI: 0.847–0.921) and a Brier score of 0.0878 (0.0734–0.1071). Key predictors included shock index, systolic blood pressure, heart rate, traumatic brain injury, hepatic insufficiency, age, gender, respiratory rate, percutaneous arterial oxygen saturation, pelvic fracture, and femoral fracture. The model also showed robust net benefit across threshold probabilities of 0.1–0.75. Conclusion: We developed an ML model to predict the need for transfusion in trauma patients and conducted a comprehensive assessment of 6 models in terms of discrimination, calibration, and clinical utility. The SHAP method was employed to visually interpret the influence of each variable, thereby enabling clinicians to better understand the underlying mechanisms of ML.

  • New
  • Research Article
  • 10.1080/14786435.2026.2617019
Prediction of impact damage to Fe-based amorphous coatings via interpretable machine learning
  • Jan 21, 2026
  • Philosophical Magazine
  • Hui Guo + 5 more

Predicting the impact damage of Fe-based amorphous coatings remains a significant challenge due to the complex, nonlinear effects of key coating properties. This study develops an interpretable machine learning (ML) framework to accurately predict impact performance and elucidate the governing physical mechanisms. Twelve regression models were systematically evaluated. The Multilayer Perceptron (MLP) demonstrated superior performance, with the highest coefficient of determination (R²) of 0.968 and the lowest error metrics (MSE: 0.066 mm², RMSE: 0.257 mm, MAE: 0.150 mm) on the testing set. Interpretability analysis via Shapley additive explanations (SHAP) quantified the global importance of input features, revealing the following hierarchy: shear stress depth ratio (γ) > residual stress (σ) > through-porosity rate (p) > maximum shear stress (τ). This ranking provides a critical, data-driven design insight: optimising the through-thickness distribution of shear stress (γ) is more crucial for mitigating interfacial delamination than merely increasing compressive residual stress. Further, an explicit, quantitative formula was derived from the MLP model, enabling rapid prediction of impact damage. This work establishes a robust ML-SHAP paradigm that not only delivers a high-accuracy predictive tool but also translates complex model outputs into actionable physical understanding and design priorities for advanced coating engineering.

  • New
  • Research Article
  • 10.1186/s12891-026-09526-1
Elucidating osteoporosis response signatures in rheumatoid arthritis using explainable machine learning ensembles.
  • Jan 21, 2026
  • BMC musculoskeletal disorders
  • Kaibin Lin + 8 more

Osteoporosis (OP) presents a significant health issue in rheumatoid arthritis (RA) patients, yet existing machine learning (ML) studies on OP prediction in this population are limited by low accuracy, a narrow range of considered risk factors, and a lack of interpretability. This study aims to develop an interpretable machine learning model using the CNN-SVM algorithm, integrated with interpretability techniques, for individualized osteoporosis risk assessment in RA patients. The model specifically focuses on the osteopenia stage, which has been overlooked in previous research, to better capture the different risk factors involved in the progression of osteoporosis in RA patients. We recruited 314 RA patients from the Department of Rheumatology and Immunology. Participants were categorized into osteoporosis, osteopenia, and normal groups based on lumbar spine or hip bone mineral density (BMD) T-scores. We constructed an ML model to assess osteoporosis using a novel classification algorithm, CNN-SVM, and employed SHapley Additive exPlanations (SHAP) and a Sankey diagram to investigate significant risk factors, rank risk factor contributions, and provide individualized feature contribution explanations. A total of 16 candidate variables were included, and three classification models were constructed to predict osteoporosis versus osteopenia, osteoporosis versus normal, and osteopenia versus normal. The AUC values for the models were 0.83, 0.93, and 0.74, respectively. Feature importance analysis using SHAP identified several key predictors. Factors such as vitamin D supplements, synovitis in both knees, and gender were crucial for distinguishing normal from osteopenia. For differentiating osteoporosis, alendronate sodium, weight, and age consistently ranked as highly influential features across different comparisons. Feature importance analysis was performed, ranking risk factors and providing individualized explanations of feature contributions. 
The developed interpretable ML model shows promise for screening osteoporosis risk in patients with RA. Its ability to identify individual risk factors highlights its potential to facilitate personalized prevention and management strategies, pending further validation.

  • New
  • Research Article
  • 10.3748/wjg.v32.i3.115527
Application of machine learning models in predicting the risk of thromboembolic events in patients with nonvariceal gastrointestinal bleeding
  • Jan 21, 2026
  • World Journal of Gastroenterology
  • Chao Lu + 10 more

BACKGROUND Clinically, patients with nonvariceal gastrointestinal bleeding (NVGB) are prone to thromboembolic events, but the specific risk remains unclear. AIM To identify risk factors and evaluate the performance of five machine learning (ML) models in predicting the risk of thromboembolic events in patients with NVGB. METHODS This retrospective cohort study enrolled 866 patients from a tertiary hospital for model training and internal validation, and 282 patients from three other tertiary hospitals for external validation. These data were used to develop five ML models to predict the risk of thromboembolic events in patients with NVGB. After initial feature selection by training ML models, ten variables were selected to construct simplified ML models. Model performance was evaluated using accuracy, precision, sensitivity, specificity, F1-score and area under the receiver operating characteristic curve. Calibration curve and decision curve analysis were used to further evaluate the predicted probabilities and net benefits of the models. RESULTS During hospitalization, the incidence of thromboembolic events was 25.61% in patients with NVGB. The categorical boosting (CatBoost) algorithm which combined variable importance and SHapley Additive exPlanations values identified 10 independent predictors of thromboembolic events: (1) History of anticoagulant drug use; (2) D-dimer level; (3) Age; (4) History of thromboembolism; (5) Length of hospital stays; (6) Intensive care unit (ICU) admission; (7) Hemoglobin level; (8) Use of hemostatic drugs; (9) Heart rate; and (10) Serum albumin level. We developed five simplified ML prediction models (L1 regularized logistic regression, random forest, support vector machines, extreme gradient boosting, and CatBoost) based on the above 10 predictors, which achieved area under the receiver operating characteristic curves of 0.805, 0.804, 0.806, 0.746, and 0.815 in external validation, respectively. 
The performance of all five ML models significantly exceeded that of D-dimer alone in both internal and external validation. The CatBoost model demonstrated good calibration and accuracy, achieving the lowest Brier score of 0.131 and 0.110 in the internal and external validation set, respectively. Of the five models, the CatBoost model was considered the preferred choice in clinical settings. CONCLUSION The findings in this study enable effective and timely preventive interventions for high-risk patients, and help avoid unnecessary monitoring in low-risk patients.

  • New
  • Research Article
  • 10.2196/81048
Machine Learning Prediction of Pharmacogenetic Testing Uptake Among Opioid-Prescribed Patients Using Electronic Health Records: Retrospective Cohort Study
  • Jan 21, 2026
  • JMIR Medical Informatics
  • Mohammad Yaseliani + 8 more

Background: Opioids are a widely prescribed class of medication for pain management. However, they have variable efficacy and adverse effects among patients, due to the complex interplay between biological and clinical factors. Pharmacogenetic testing can be used to match patients’ genetic profiles to individualize opioid therapy, improving pain relief and reducing the risk of adverse effects. Despite its potential, the pharmacogenetic testing uptake (use of pharmacogenetic testing) remains low due to a range of barriers at the patient, health care provider, infrastructure, and financial levels. Since testing typically involves a shared decision between the provider and patient, predicting the likelihood of a patient undergoing pharmacogenetic testing and understanding the factors influencing that decision can help optimize resource use and improve outcomes in pain management. Objective: This study aimed to develop machine learning (ML) models, identifying patients’ likelihood of pharmacogenetic uptake based on their demographics, clinical variables, medication use, and social determinants of health. Methods: We used electronic health record data from a single-center health care system to identify patients prescribed opioids. We extracted patients’ demographics, clinical variables, medication use, and social determinants of health, and developed and validated ML models, including a neural network, logistic regression, random forest, extreme gradient boosting (XGB), naïve Bayes, and support vector machines for pharmacogenetic testing uptake prediction based on procedure codes. We performed 5-fold cross-validation and created an ensemble probability-based classifier using the best-performing ML models for pharmacogenetic testing uptake prediction. 
Various performance metrics, uptake stratification analysis, and feature importance analysis were used to evaluate the performance of the models. Results: The ensemble model using XGB and support vector machine–radial basis function classifiers had the highest C-statistics at 79.61%, followed by XGB (78.94%), and neural network (78.05%). While XGB was the best-performing model, the ensemble model achieved a high accuracy (32,699/48,528, 67.38%), recall (537/702, 76.50%), specificity (32,162/47,826, 67.25%), and negative predictive value (32,162/32,327, 99.49%). The uptake stratification analysis using the ensemble model indicated that it can effectively distinguish across uptake probability deciles, where those in the higher strata are more likely to undergo pharmacogenetic testing in the real world (320/4853, 6.59% in the highest decile compared to 6/4853, 0.12% in the lowest). Furthermore, Shapley Additive Explanations value analysis using the XGB model indicated age, hypertension, and household income as the most influential factors for pharmacogenetic testing uptake prediction. Conclusions: The proposed ensemble model demonstrated a high performance in pharmacogenetic testing uptake prediction among patients using opioids for pain. This model can be used as a decision support tool, assisting clinicians in identifying patients’ likelihood of pharmacogenetic testing uptake and guiding appropriate decision-making.
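
The accuracy, recall, specificity, and negative predictive value in the abstract above are all given as fractions, so the underlying confusion matrix can be reconstructed from them. The Python sketch below does that under the usual definitions of these metrics; the class name `ConfusionCounts` is hypothetical, and the false-negative and false-positive counts are derived here by subtraction rather than quoted from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class = test uptake)."""
    tp: int
    fn: int
    tn: int
    fp: int

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    @property
    def recall(self) -> float:
        # Also called sensitivity: share of actual positives that were found.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def npv(self) -> float:
        # Negative predictive value: share of predicted negatives that are correct.
        return self.tn / (self.tn + self.fn)


# Counts derived from the fractions in the abstract:
# recall 537/702 and specificity 32,162/47,826.
counts = ConfusionCounts(tp=537, fn=702 - 537, tn=32_162, fp=47_826 - 32_162)

for name in ("accuracy", "recall", "specificity", "npv"):
    print(f"{name}: {getattr(counts, name):.2%}")
```

The derived counts are internally consistent with the abstract: true positives plus true negatives give the 32,699 correct predictions behind the accuracy, and true negatives plus false negatives give the 32,327 predicted negatives behind the negative predictive value.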

  • New
  • Research Article
  • 10.3390/cli14010024
Short-Term Heavy Rainfall Potential Identification Driven by Physical Features: Model Development and SHAP-Based Mechanism Interpretation
  • Jan 20, 2026
  • Climate
  • Jingjing An + 6 more

Accurate analysis and forecasting of short-term heavy rainfall (hourly rainfall ≥ 20 mm) are crucial for extending warning lead time, enabling targeted preventive measures, and supporting efficient resource allocation. In recent years, machine learning techniques combined with atmospheric physical variables have offered promising new approaches for analyzing and forecasting short-term heavy rainfall. However, these methods often lack transparency, which hinders the interpretation of key atmospheric physical variables that drive short-term heavy rainfall and their coupling mechanisms. To address this challenge, the present study integrates the interpretable SHAP (SHapley Additive exPlanations) framework with machine learning to examine potential relationships between widely used atmospheric physical variables and short-term heavy rainfall, thereby improving model interpretability. CatBoost models were constructed based on multiple feature-input strategies using 71 physical variables across five categories derived from ERA5 reanalysis data, and their performance was compared with two benchmark algorithms, XGBoost and LightGBM. The SHAP method was subsequently applied to quantify the contributions of individual features and their interaction effects on model predictions. 
The results indicate that (1) the CatBoost model, utilizing all 71 physical variables, outperforms other feature combinations, with an AUC of 0.933, and F1 score of 0.930, and a Recall of 0.954, significantly higher than the XGBoost and LightGBM models; (2) Shapley value analysis identified 500 hPa vertical velocity, the A-index, and precipitable water as the most influential features on model performance; (3) The predictive mechanism for short-term heavy rainfall is fundamentally bifurcated: negative instances are classified through the discrete main effects of individual features, whereas positive event detection necessitates a sophisticated coordination of intrinsic main effects and synergistic interactions. Among the feature categories, the horizontal and vertical wind fields, stability and energy indices, and humidity-related variables exhibited the highest contribution ratios, with wind field features demonstrating the strongest interaction effects. The results confirm that integrating atmospheric physical variables with the CatBoost ensemble learning approach significantly improves short-term heavy rainfall identification. Furthermore, incorporating the SHAP interpretability framework provides a theoretical foundation for elucidating the mechanisms of feature influence and optimizing model performance.
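
The per-feature contributions and interaction effects described in the abstract above rest on Shapley values. As a rough illustration only, the Python sketch below computes exact Shapley values by brute force for a tiny hypothetical model. It is not the method used in the paper or the algorithm in the SHAP library, which relies on faster estimators; the names `shapley_values` and `toy_model`, and the choice to represent an "absent" feature by a baseline value, are assumptions made for this example.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Sequence


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> list[float]:
    """Exact Shapley values; cost grows exponentially with len(x)."""
    n = len(x)

    def value(present: frozenset) -> float:
        # Assumption: features outside the coalition take their baseline value.
        z = [x[i] if i in present else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for a coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z: Sequence[float]) -> float:
    # Hypothetical model: one main effect plus one pairwise interaction.
    return 2.0 * z[0] + z[1] * z[2]


phi = shapley_values(toy_model, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print([round(p, 6) for p in phi])
```

In this toy case the main effect of the first feature is attributed entirely to it, while the interaction term is split equally between the second and third features, and the three values sum to the difference between the model output at `x` and at the baseline.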

  • New
  • Research Article
  • 10.3174/ajnr.a9166
Preoperative Neuroimaging Markers, Clinical Severity Measures, and Shunt Characteristics for Predicting Shunt Revision in Idiopathic Intracranial Hypertension: An Explainable Machine-Learning Study.
  • Jan 20, 2026
  • AJNR. American journal of neuroradiology
  • Seifollah Gholampour + 6 more

Surgical shunt placement is a common treatment for idiopathic intracranial hypertension (IIH) but is hampered by high revision rates. Prior predictive models for shunt revision in IIH have overlooked disease-specific neuroimaging markers. We developed an explainable machine learning model to identify the strongest predictors of shunt revision across neuroimaging markers, clinical severity variables, and shunt-specific factors. The primary objective was to assess the contribution of IIH-related neuroimaging markers within this multimodal predictive framework. In this single-center retrospective cohort study of IIH patients treated from 2001 to 2022, we analyzed 23 variables, including validated neuroradiologic biomarkers, clinical characteristics, and shunt-specific factors. We developed ten machine learning classifiers, which were trained and tuned on 75% of the data using stratified 5-fold cross-validation. Final model performance was validated on an independent, held-out test set comprising the remaining 25% of patients. We then employed SHapley Additive exPlanations for model interpretability and Kaplan-Meier analysis to evaluate time-dependent risk of shunt revision. Among 128 patients (78 with shunt revision, 50 without), a stacked ensemble model (random forest + XGBoost) achieved the best performance on the independent held-out test set (25% of the cohort), with an accuracy of 78.2% (95% confidence interval, 63.1%-90.2%) and an area under the curve of 82.7% (95% confidence interval, 71.5%-92.0%). Model interpretability showed that optic nerve sheath diameter (MRI-derived), papilledema and visual field deficits (ophthalmic clinical and neuro-ophthalmic measures), together with shunt characteristics (nonprogrammable valves, lumboperitoneal shunting, higher initial valve pressure), were the highest contributors to predicted revision risk. 
Kaplan-Meier analysis showed longer shunt survival with programmable valves and in patients without preoperative visual field deficits, papilledema, or obesity. In this cohort, MRI-derived optic nerve sheath diameter, papilledema, visual field deficits, and shunt characteristics were consistently among the most influential contributors to predicted risk of shunt revision. These findings highlight the added value of MRI-derived markers within a multimodal preoperative assessment, although prospective external validation is required before clinical adoption. SHAP = SHapley Additive exPlanations; ICP = Intracranial Pressure; IIH = Idiopathic Intracranial Hypertension; ML = Machine Learning; ONSD = Optic Nerve Sheath Diameter; LPS = Lumboperitoneal Shunt; XGBoost = Extreme Gradient Boosting.
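For readers unfamiliar with the Kaplan-Meier analysis mentioned in this abstract, the product-limit estimator can be sketched in a few lines. This is an illustrative sketch only: the `durations` and `events` values are hypothetical and are not data from the study, and the function name `kaplan_meier` is chosen here for illustration.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate: (time, survival probability) at each event time.

    durations: follow-up time for each patient
    events:    True if the event (e.g. shunt revision) occurred at that time,
               False if the patient was censored (follow-up ended without event)
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e)
        n_leaving = sum(1 for d in durations if d == t)
        if n_events:
            # Multiply by the fraction of at-risk patients who survive time t
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up data (months), not taken from the study.
durations = [2, 3, 3, 5, 8, 8, 12]
events = [True, True, False, True, False, True, False]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored patients never lower the curve directly; they only shrink the risk set for later event times, which is why the step at month 8 is larger than the step at month 2.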

  • New
  • Research Article
  • 10.1007/s10389-025-02610-1
Spatial modeling and machine learning-based assessment of regional stroke risk and predictors in Ghana: a cross-sectional study
  • Jan 20, 2026
  • Journal of Public Health
  • Abdul-Karim Iddrisu + 1 more

Aim: Stroke remains a leading global cause of death and disability, and its incidence is rising in Ghana, posing a significant public health concern. However, comprehensive data on its spatial distribution across Ghana’s 16 regions are limited. This study aimed to assess the spatial distribution of stroke risk and to identify high-risk regions and associated risk factors. Subject and methods: Using nonparametric ensemble machine learning models (random forest and gradient boosting), the study performed variable selection and predicted stroke risk. Key predictors identified were incorporated into a Bayesian spatial model (BYM2) to estimate region-specific relative risk (RR). Posterior estimates were mapped to visualize spatial trends, and interpretability tools such as partial dependence plots and SHAP (SHapley Additive exPlanations) values were used to analyze covariate effects. Results: There was a modest overall increase in stroke risk (3%), with notable regional variation. The Volta and Central regions exhibited the highest risk (relative risk [RR] = 3.0–3.5 and 2.5–3.0), while the Savannah and Northern regions had the lowest (RR = 0.0–1.0). Gradient boosting outperformed random forest (75% vs. 13% accuracy), identifying gross national income (GNI) and diabetes prevalence as top predictors. Higher GNI was linked to reduced stroke risk (RR = 0.95), whereas increased diabetes prevalence was associated with higher risk (RR = 1.18). Stroke risk decreased sharply at a GNI threshold of 26% and rose steadily with diabetes prevalence. Regions with high GNI and low diabetes prevalence had lower stroke counts. Conclusion: The study highlights significant regional disparities and key predictors of stroke risk in Ghana, offering valuable insights for targeted public health strategies and equitable resource allocation.

  • New
  • Research Article
  • 10.3389/fdgth.2025.1752699
Development and independent validation of explainable radiomics-based machine learning models for prognosis in colorectal liver metastases
  • Jan 19, 2026
  • Frontiers in Digital Health
  • A Brunetti + 5 more

Introduction: Colorectal cancer frequently leads to liver metastases (CRLM), posing a major challenge to long-term survival. Prognosis remains heterogeneous, and traditional clinical risk scores often lack biological depth and spatial information. Advances in radiomics and machine learning (ML) offer the potential for improved, explainable outcome prediction; however, robust and interpretable prognostic models for CRLM remain an unmet need. This study aimed to develop and validate explainable ML models based on radiomic features extracted from both metastatic lesions and background liver tissue, enhancing the prediction of recurrence and overall survival (OS) status in patients with CRLM. Materials and methods: Patient data and contrast-enhanced CT images from two independent cohorts were analysed: a publicly available TCIA-CRLM series, employed as the discovery set, and a real-life clinical cohort, used as an external validation set. Segmentation focused on the largest liver metastasis (L-MAX) and surrounding healthy liver tissue (L-BKG), extracting radiomic features from both areas and their ratios (L-MAX/L-BKG). An end-to-end pipeline for data preprocessing and classification was designed. Multiple ML and Deep Learning (DL) classifiers were trained and validated. Model interpretability was assessed using SHapley Additive exPlanations (SHAP) analysis to identify key predictive radiomic determinants. Performances were compared to recognized clinical models. Results: For recurrence prediction, the best-performing classifier was a soft-voting ensemble of a multilayer perceptron (MLP) optimized via a Genetic Algorithm (GA); for OS status classification, the best performance was obtained by a hard-voting ensemble of a GA-optimized MLP. Both classifiers demonstrated robust discrimination capabilities in external validation, with AUCs of 0.78 and 0.68, respectively.
The explainability analysis performed with SHAP revealed the most relevant radiomic determinants in the classification. These features retained prognostic significance in the independent cohort, supporting their use for clinical risk stratification. Discussion: Explainable ML models leveraging both lesion-centric and contextual liver radiomics offer clinically transparent prediction of recurrence and survival in CRLM. SHAP highlighted clinically plausible, reproducible imaging determinants, enabling risk stratification. The validation of specific radiomic determinants suggests the potential practical utility of this approach, laying the groundwork for integrating with DL and multi-omic data in future oncology strategies.
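The abstract contrasts soft-voting and hard-voting ensembles. As a rough sketch of the difference for a binary task: soft voting averages the members' predicted probabilities before thresholding, whereas hard voting thresholds each member first and takes the majority label. The `member_probs` values, the 0.5 threshold, and the tie rule (ties go to class 0) are assumptions made for illustration and are not details reported by the study.

```python
def soft_vote(probs, threshold=0.5):
    """Average positive-class probabilities, then threshold the mean."""
    mean_p = sum(probs) / len(probs)
    return int(mean_p >= threshold), mean_p


def hard_vote(probs, threshold=0.5):
    """Threshold each member separately, then take the majority label."""
    votes = [int(p >= threshold) for p in probs]
    # Strict majority required; a tie falls back to class 0 (an assumption).
    return int(sum(votes) * 2 > len(votes))


# Hypothetical positive-class probabilities from three ensemble members.
member_probs = [0.90, 0.40, 0.45]

soft_label, mean_p = soft_vote(member_probs)
hard_label = hard_vote(member_probs)
print(f"soft vote: label={soft_label} (mean probability {mean_p:.3f})")
print(f"hard vote: label={hard_label}")
```

With these inputs the two rules disagree: one confident member pulls the averaged probability above the threshold, while two of the three individual votes are negative. This is the usual reason the two schemes can rank differently on the same task.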

  • New
  • Research Article
  • 10.1016/j.jare.2026.01.050
MetsObesity: a novel classification system for predicting 15-year cardiovascular risk in the UK Biobank population.
  • Jan 19, 2026
  • Journal of advanced research
  • Junaid Iqbal + 7 more

  • New
  • Research Article
  • 10.1186/s12885-026-15554-w
Interpretable machine learning using CT radiomics predicts pathological upgrading after secondary resection in non-muscle-invasive bladder cancer.
  • Jan 19, 2026
  • BMC cancer
  • Xue Peng Rao + 6 more

Repeat transurethral resection (ReTUR) is essential for reducing residual and recurrent non-muscle-invasive bladder cancer (NMIBC). Pathological upstaging after ReTUR significantly influences prognosis. This study aimed to develop an interpretable machine learning model using CT radiomics to predict the risk of pathological upstaging following ReTUR in NMIBC patients. We retrospectively analyzed 104 NMIBC patients who underwent ReTUR at the Second Affiliated Hospital of Nanchang University from March 2019 to July 2022. Data were split 7:3 into training and internal validation sets. An external validation set included 40 patients from two other hospitals. Radiomic features were extracted from preoperative CT scans. Least Absolute Shrinkage and Selection Operator (LASSO) and multivariate logistic regression were used to identify predictors of pathological upstaging. Four machine learning models, including Extreme Gradient Boosting (XGBoost), Gradient Boosting Decision Tree (GBDT), Random Forest (RF), and Linear Discriminant Analysis (LDA), were constructed and evaluated using AUC, accuracy, precision, F1 score, calibration curves, and decision curve analysis (DCA). The best model was interpreted via SHapley Additive exPlanations (SHAP) to identify key predictive features. Tumor grade (OR = 7.02, 95% CI: 1.17-42.21), tumor size (OR = 5.83, 95% CI: 1.21-28.15), and tumor number (OR = 6.83, 95% CI: 1.18-39.52) were independent risk factors. From 4,738 radiomic features, nine were selected. The XGBoost model outperformed others, with an AUC of 0.804 (95% CI: 0.756-0.862), accuracy of 77.4%, precision of 82.7%, and F1 score of 0.701 in internal validation. External validation confirmed its robustness. SHAP analysis highlighted Wavelet_LLH_firstorder_Maximum.1, Gradient_ngtdm_Complexity, and tumor grade as top predictors. The model showed good calibration and clinical utility on DCA.
An interpretable CT radiomics-based machine learning model integrating clinical and imaging features was developed to accurately predict pathological upstaging risk after ReTUR in NMIBC patients. This tool may support clinical decision-making for individualized treatment after multicenter validation.
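The accuracy, precision, and F1 figures quoted in this abstract all derive from a confusion matrix. The sketch below shows how they relate, using hypothetical counts (`tp`, `fp`, `fn`, `tn`) that are not taken from the study; the helper name `classification_metrics` is likewise illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for a 30-patient validation set.
metrics = classification_metrics(tp=8, fp=2, fn=4, tn=16)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is a harmonic mean, it sits closer to the lower of precision and recall, so a model can report high precision together with a noticeably lower F1 when it misses a fair share of the true positives.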

  • New
  • Research Article
  • 10.1016/j.compbiomed.2026.111468
Multimodal diagnosis of Parkinson's disease with an internet-based collaborative agent architecture of medical language models.
  • Jan 19, 2026
  • Computers in biology and medicine
  • Eugenio Peixoto Junior + 5 more

  • New
  • Research Article
  • 10.3390/ijgi15010045
Flood Susceptibility and Risk Assessment in Myanmar Using Multi-Source Remote Sensing and Interpretable Ensemble Machine Learning Model
  • Jan 19, 2026
  • ISPRS International Journal of Geo-Information
  • Zhixiang Lu + 4 more

Floods are among the most frequent and devastating natural hazards, particularly in developing countries such as Myanmar, where monsoon-driven rainfall and inadequate flood-control infrastructure exacerbate disaster impacts. This study presents a satellite-driven and interpretable framework for high-resolution flood susceptibility and risk assessment by integrating multi-source remote sensing and geospatial data with ensemble machine-learning models—Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM)—implemented on the Google Earth Engine (GEE) platform. Eleven satellite- and GIS-derived predictors were used, including the Digital Elevation Model (DEM), slope, curvature, precipitation frequency, the Normalized Difference Vegetation Index (NDVI), land-use type, and distance to rivers, to develop flood susceptibility models. The Jenks natural breaks method was applied to classify flood susceptibility into five categories across Myanmar. Both models achieved excellent predictive performance, with area under the receiver operating characteristic curve (AUC) values of 0.943 for XGBoost and 0.936 for LightGBM, effectively distinguishing flood-prone from non-prone areas. XGBoost estimated that 26.1% of Myanmar’s territory falls within medium- to high-susceptibility zones, while LightGBM yielded a similar estimate of 25.3%. High-susceptibility regions were concentrated in the Ayeyarwady Delta, Rakhine coastal plains, and the Yangon region. SHapley Additive exPlanations (SHAP) analysis identified precipitation frequency, NDVI, and DEM as dominant factors, highlighting the ability of satellite-observed environmental indicators to capture flood-relevant surface processes.
To incorporate exposure, population density and nighttime-light intensity were integrated with the susceptibility results to construct a natural–social flood risk framework. This observation-based and explainable approach demonstrates the applicability of multi-source remote sensing for flood assessment in data-scarce regions, offering a robust scientific basis for flood management and spatial planning in monsoon-affected areas.
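The Jenks natural breaks classification mentioned in this abstract groups sorted values so that the within-class squared deviation is as small as possible. The sketch below implements that objective with a straightforward dynamic program; it is an illustration under assumptions, not the procedure used in the study. The `scores` are hypothetical susceptibility values, three classes are used instead of the study's five to keep the example small, and GIS packages may differ in implementation details.

```python
def natural_breaks(values, n_classes):
    """Split sorted values into contiguous classes minimizing the total
    within-class sum of squared deviations (the Jenks objective)."""
    data = sorted(values)
    n = len(data)
    # Prefix sums give the cost of any contiguous slice in O(1).
    ps = [0.0] * (n + 1)
    ps2 = [0.0] * (n + 1)
    for k, v in enumerate(data):
        ps[k + 1] = ps[k] + v
        ps2[k + 1] = ps2[k] + v * v

    def cost(i, j):
        # Sum of squared deviations of data[i:j] around its own mean
        s = ps[j] - ps[i]
        return (ps2[j] - ps2[i]) - s * s / (j - i)

    inf = float("inf")
    # best[c][j]: minimal cost of splitting data[:j] into c classes
    best = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    cut = [[0] * (n + 1) for _ in range(n_classes + 1)]
    best[0][0] = 0.0
    for c in range(1, n_classes + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = best[c - 1][i] + cost(i, j)
                if candidate < best[c][j]:
                    best[c][j] = candidate
                    cut[c][j] = i   # last class is data[i:j]

    # Walk back through the stored cut points to recover the classes.
    groups, j = [], n
    for c in range(n_classes, 0, -1):
        i = cut[c][j]
        groups.append(data[i:j])
        j = i
    return groups[::-1]


# Hypothetical susceptibility scores in [0, 1].
scores = [0.90, 0.05, 0.55, 0.10, 0.95, 0.12, 0.50]
groups = natural_breaks(scores, 3)
for label, members in zip(["low", "medium", "high"], groups):
    print(label, members)
```

The triple loop is quadratic in the number of values per class, which is acceptable for a short list but would need a sampled or optimized variant for raster-scale data.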
