Application of multi-scale feature extraction and explainable machine learning in chest x-ray position evaluation within an integrated learning framework.
This study presents a novel deep learning-machine learning fusion network for quantitative and interpretable assessment of chest X-ray positioning, aiming to analyze critical factors in patient positioning layout. In this retrospective study, we analyzed 3300 chest radiographs from a Chinese medical institution, collected between March 2021 and December 2022. The dataset was partitioned into the XJ_chest_21 subset for training the automated segmentation model and the XJ_chest_22 subset for validating three classification models: Random Forest Fusion Network (RFFN), Threshold Classification (TC), and Multivariate Logistic Regression (MLR). After five positioning indicators were automatically measured in the images, the data were input into the models to assess positioning quality. We compared the performance metrics of the three classification models, including AUC, accuracy, sensitivity, and specificity. SHAP (Shapley Additive Explanations) was used to interpret feature importance in the decision-making process of the RFFN model. We evaluated measurement consistency between the Automated Measurement Model (AMM) and radiologists. U-net++ demonstrated significantly superior performance compared to U-net in multi-target segmentation accuracy (mean Dice: 0.926 vs. 0.812). The five positioning metrics showed excellent agreement between AMM and reference standards (r = 0.93). ROC analysis indicated that RFFN performed significantly better in overall image quality classification (AUC, 0.982; 95% CI: 0.963, 0.993) compared to both TC (AUC, 0.959; 95% CI: 0.923, 0.995) and MLR (AUC, 0.953; 95% CI: 0.933, 0.974). Our study introduces a novel segmentation-based random forest fusion network that achieves accurate image positioning classification and identifies critical operational factors. Furthermore, the clinical interpretability of the fusion model was enhanced through the application of the SHAP method.
Question: How can AI-driven interpretable methods be utilized to assess patient positioning in chest radiography and enhance radiographers' accuracy? Findings: The Random Forest Fusion Network (RFFN) outperformed Threshold Classification (TC) and Multivariate Logistic Regression (MLR) in positioning classification (AUC = 0.98). Clinical relevance: An integrated framework that combines deep learning and machine learning achieves accurate image positioning classification, identifies critical operational factors, enables expert-level image quality assessment, and delivers automated feedback to radiographers.
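The segmentation comparison above is reported as a mean Dice score. As a minimal sketch of how such a score can be computed for binary masks (the toy masks and the function names `dice_coefficient` and `mean_dice` are illustrative assumptions, not the authors' code):

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as nested lists."""
    pred_flat = [bool(p) for row in pred for p in row]
    truth_flat = [bool(t) for row in truth for t in row]
    if len(pred_flat) != len(truth_flat):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(pred_masks, truth_masks):
    """Average Dice over several targets (for example, one mask per structure)."""
    scores = [dice_coefficient(p, t) for p, t in zip(pred_masks, truth_masks)]
    return sum(scores) / len(scores)


# Hypothetical 2x3 masks, for illustration only.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
print(dice_coefficient(pred, truth))
```

A score of 1 means the predicted and reference masks overlap exactly, and 0 means they do not overlap at all.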
- Research Article
12
- 10.1016/j.jobe.2023.108370
- Dec 24, 2023
- Journal of Building Engineering
This study explores the influence of concrete mix ingredients on the non-steady chloride migration coefficient (Dnssm) using an explainable machine learning (XML) approach that integrates Extreme Gradient Boosting (XGBoost) and Shapley Additive Explanations (SHAP). The dataset, comprising 204 observations from the literature, is used to train the XGBoost algorithm for predicting Dnssm. The model performs well, with MAE = 1.61 × 10⁻¹² m²/s, RMSE = 2.38 × 10⁻¹² m²/s, and R² = 0.95 on the training set and MAE = 2.22 × 10⁻¹² m²/s, RMSE = 3.18 × 10⁻¹² m²/s, and R² = 0.87 on the test set. The SHAP method provides comprehensive insights into feature importance, offering valuable information about the relationships and dependencies among various features. The top five features identified as significant contributors are coarse aggregate, superplasticizer, concrete age, cement, and water. Visualization of SHAP values through diverse plots proves essential for obtaining a thorough understanding of feature influence. The explainability of the model's results contributes new insights, aiding in the development of optimal and sustainable concrete with enhanced resistance to chloride penetration. Furthermore, the model's explainability fosters trust in its predictions, facilitating seamless integration into real-world applications.
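The three regression metrics quoted in this abstract (MAE, RMSE, and R²) follow standard definitions. The sketch below shows those definitions on made-up numbers; the sample values and the helper name `regression_metrics` are assumptions for illustration and do not come from the study's dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired observed and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical migration coefficients, in units of 1e-12 m^2/s.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
mae, rmse, r2 = regression_metrics(observed, predicted)
print(mae, rmse, r2)
```

Because RMSE squares the errors before averaging, it is never smaller than MAE and grows faster when a few predictions are far off, which is consistent with the RMSE values in the abstract exceeding the MAE values.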
- Research Article
25
- 10.1148/radiol.2021210578
- Aug 31, 2021
- Radiology
Background A computer-aided detection (CAD) system may help surveillance for pulmonary metastasis at chest radiography in situations where there is limited access to CT. Purpose To evaluate whether a deep learning (DL)-based CAD system can improve diagnostic yield for newly visible lung metastasis on chest radiographs in patients with cancer. Materials and Methods A regulatory-approved CAD system for lung nodules was implemented to interpret chest radiographs from patients referred by the medical oncology department in clinical practice. In this retrospective diagnostic cohort study, chest radiographs interpreted with assistance from a CAD system after the implementation (January to April 2019, CAD-assisted interpretation group) and those interpreted before the implementation (September to December 2018, conventional interpretation group) of the CAD system were consecutively included. The diagnostic yield (frequency of true-positive detections) and false-referral rate (frequency of false-positive detections) of formal reports of chest radiographs for newly visible lung metastasis were compared between the two groups using generalized estimating equations. Propensity score matching was performed between the two groups for age, sex, and primary cancer. Results A total of 2916 chest radiographs from 1521 patients (1546 men, 1370 women; mean age, 62 years) and 5681 chest radiographs from 3456 patients (2941 men, 2740 women; mean age, 62 years) were analyzed in the CAD-assisted interpretation and conventional interpretation groups, respectively. The diagnostic yield for newly visible metastasis was higher in the CAD-assisted interpretation group (0.86%, 25 of 2916 [95% CI: 0.58, 1.3] vs 0.32%, 18 of 5681 [95% CI: 0.20, 0.50]; P = .004).
The false-referral rate in the CAD-assisted interpretation group (0.34%, 10 of 2916 [95% CI: 0.19, 0.64]) was not inferior to that in the conventional interpretation group (0.25%, 14 of 5681 [95% CI: 0.15, 0.42]) at the noninferiority margin of 0.5% (95% CI of difference: -0.15, 0.35). Conclusion A deep learning-based computer-aided detection system improved the diagnostic yield for newly visible metastasis on chest radiographs in patients with cancer with a similar false-referral rate. © RSNA, 2021 Online supplemental material is available for this article.
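The reported percentages can be checked directly from the counts. The short sketch below recomputes them, taking 5681 as the number of radiographs in the conventional interpretation group, as stated in the Results; the helper `rate_percent` is a hypothetical name used only for this illustration.

```python
def rate_percent(events, total):
    """Return events/total expressed as a percentage."""
    return 100.0 * events / total


# Counts as reported in the abstract.
cad_total, conv_total = 2916, 5681

yield_cad = rate_percent(25, cad_total)    # true-positive detections, CAD-assisted
yield_conv = rate_percent(18, conv_total)  # true-positive detections, conventional
false_cad = rate_percent(10, cad_total)    # false-positive detections, CAD-assisted
false_conv = rate_percent(14, conv_total)  # false-positive detections, conventional

# Point estimate of the difference in false-referral rate, in percentage points.
false_referral_diff = false_cad - false_conv

print(round(yield_cad, 2), round(yield_conv, 2))
print(round(false_cad, 2), round(false_conv, 2), round(false_referral_diff, 2))
```

The point estimate of the difference is only part of the noninferiority argument: the conclusion in the abstract rests on the upper bound of the 95% CI of the difference (0.35) lying below the 0.5% margin.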
- Research Article
- 10.1093/bjd/ljaf085.030
- Jun 27, 2025
- British Journal of Dermatology
Evidence-based precision medicine strategies do not currently exist to guide the choice of biologics in the treatment of psoriasis. As a result, a costly and arduous trial-and-error approach is often adopted. Artificial intelligence has the potential to improve personalization through the prediction of treatment outcomes using real-world data, such as that within the British Association of Dermatologists Biologics and Immunomodulators Register (BADBIR). We aimed to develop an explainable machine learning (ML) model to predict biologic drug discontinuation in a biologic-naive psoriasis cohort using BADBIR data. BADBIR data (2007–2024) were engineered to enable readability. Adult biologic-naive patients across all biologic cohorts with > 6 months of follow-up data were included. Recruitment centres representing 10% of the overall cohort were randomly separated for external validation (model testing). The residual cohort was then randomly split for model training (80%) and internal validation (20%, for hyperparameter tuning). Random forest modelling was applied for imputation of missing data. Only clinical data at baseline prior to biologic initiation were used for model training to enhance future clinical utilization. The performance of several ML (XGBoost, AdaBoost, random forest) and deep learning (simple and recurrent neural networks) algorithms was evaluated. External validation was performed with a cross-validation leave-group-out approach of individual recruitment centres. SHAP (SHapley Additive exPlanations) and permutation feature importance values were generated to understand model predictions. In total, 10 806 patients were included in the cohorts for training (n = 7722), internal validation (n = 1930) and external validation (for final model testing: nine centres, n = 1154). Most patients (n = 7290, 67%) discontinued initial biologic therapy within their follow-up duration (median 6.6 years).
Within the discontinuation cohort, adalimumab (originator and biosimilars, 57%) was most prescribed. Higher proportions of female patients (43% vs. 37%) and patients with psoriatic arthritis (21% vs. 17%) and scalp psoriasis (59% vs. 51%) were noted in the discontinuation vs. the continuation cohort, respectively. AdaBoost, an ensemble ML model, outperformed other evaluated models with regard to area under the receiver operating characteristic curve (AUROC). Model testing predicted discontinuation of biologic therapy with (mean, 95% confidence interval) precision 0.85 (0.83–0.88), recall 0.80 (0.78–0.83), F1 score 0.82, AUROC 0.76 (0.71–0.78) and area under the precision recall curve (AUPRC) 0.83 (0.81–0.86). Performance metrics following testing with cross-validation [mean (SD)] were precision 0.79 (0.09), recall 0.69 (0.2), F1 score 0.74 (0.16), AUROC 0.71 (0.06) and AUPRC 0.75 (0.11). The features contributing most significantly to model performance were initial biologic drug, baseline Psoriasis Area and Severity Index, patient age, recruitment centre and baseline white cell count. In conclusion, AdaBoost represents an explainable ML model with potential clinical utility to predict treatment outcomes of patients with psoriasis using real-world registry data. Future work will investigate discontinuation risk across a range of individual biologic therapies.
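The F1 score is the harmonic mean of precision and recall, so the reported F1 values can be related to the reported precision and recall. The sketch below applies that definition to the mean values quoted in this abstract; it is only a consistency check, since the F1 of mean precision and mean recall need not equal the mean of per-fold F1 scores.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Mean precision and recall as quoted in the abstract.
external_f1 = f1_score(0.85, 0.80)  # external validation (model testing)
cv_f1 = f1_score(0.79, 0.69)        # leave-group-out cross-validation

print(round(external_f1, 2), round(cv_f1, 2))
```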
- Research Article
- 10.1093/bjd/ljaf085.200
- Jun 27, 2025
- British Journal of Dermatology
Evidence-based precision medicine strategies do not currently exist to guide the choice of biologics in the treatment of psoriasis. As a result, a costly and arduous trial-and-error approach is often adopted. Artificial intelligence has the potential to improve personalization through the prediction of treatment outcomes using real-world data, such as that within the British Association of Dermatologists Biologics and Immunomodulators Register (BADBIR). We aimed to develop an explainable machine learning (ML) model to predict biologic drug discontinuation in a biologic-naive psoriasis cohort using BADBIR data. BADBIR data (2007–2024) were engineered to enable readability. Adult biologic-naive patients across all biologic cohorts with > 6 months of follow-up data were included. Recruitment centres representing 10% of the overall cohort were randomly separated for external validation (model testing). The residual cohort was then randomly split for model training (80%) and internal validation (20%, for hyperparameter tuning). Random forest modelling was applied for imputation of missing data. Only clinical data at baseline prior to biologic initiation were used for model training to enhance future clinical utilization. The performance of several ML (XGBoost, AdaBoost, random forest) and deep learning (simple and recurrent neural networks) algorithms was evaluated. External validation was performed with a cross-validation leave-group-out approach of individual recruitment centres. SHAP (SHapley Additive exPlanations) and permutation feature importance values were generated to understand model predictions. In total, 10 806 patients were included in the cohorts for training (n = 7722), internal validation (n = 1930) and external validation (for final model testing: nine centres, n = 1154). Most patients (n = 7290, 67%) discontinued initial biologic therapy within their follow-up duration (median 6.6 years).
Within the discontinuation cohort, adalimumab (originator and biosimilars, 57%) was most prescribed. Higher proportions of female patients (43% vs. 37%) and patients with psoriatic arthritis (21% vs. 17%) and scalp psoriasis (59% vs. 51%) were noted in the discontinuation vs. the continuation cohort, respectively. AdaBoost, an ensemble ML model, outperformed other evaluated models with regard to area under the receiver operating characteristic curve (AUROC). Model testing predicted discontinuation of biologic therapy with (mean, 95% CI) precision 0.85 (0.83–0.88), recall 0.80 (0.78–0.83), F1 score 0.82, AUROC 0.76 (0.71–0.78) and area under the precision recall curve (AUPRC) 0.83 (0.81–0.86). Performance metrics following testing with cross-validation [mean (SD)] were precision 0.79 (0.09), recall 0.69 (0.2), F1 score 0.74 (0.16), AUROC 0.71 (0.06) and AUPRC 0.75 (0.11). The features contributing most significantly to model performance were initial biologic drug, baseline Psoriasis Area and Severity Index, patient age, recruitment centre and baseline white cell count. In conclusion, AdaBoost represents an explainable ML model with potential clinical utility to predict treatment outcomes of patients with psoriasis using real-world registry data. Future work will investigate discontinuation risk across a range of individual biologic therapies.
- Research Article
- 10.1007/s00216-025-05816-0
- Mar 29, 2025
- Analytical and Bioanalytical Chemistry
This study proposes a rapid identification method for foodborne pathogens by combining Raman spectroscopy with explainable machine learning. Spectral data of nine common foodborne pathogens are collected using a laser confocal Raman spectrometer, and their characteristic Raman peaks are identified and analyzed. Key spectral features are extracted using competitive adaptive reweighted sampling (CARS) and the successive projections algorithm (SPA), while t-distributed stochastic neighbor embedding (t-SNE) is employed for visualization. Subsequently, classification models, including support vector machine (SVM) and random forest (RF), are developed, and the optimal model is selected based on classification accuracy (ACC), with the RF model achieving a test accuracy of 98.91%. To enhance the interpretability of the model, Shapley Additive exPlanations (SHAP) analysis is applied to evaluate the contribution of each spectral feature to the classification results, identifying critical Raman shifts significantly influencing pathogen classification. The results demonstrate that CARS-SPA feature selection not only improves the accuracy and efficiency of the classification model but also enhances its transparency and reliability. This study optimizes the workflow for food safety testing, reduces the risk of foodborne diseases, and provides robust technical support for public health and safety.
- Research Article
15
- 10.3389/frwa.2023.1112970
- Mar 14, 2023
- Frontiers in Water
Long short-term memory (LSTM) networks have demonstrated successful applications in accurately and efficiently predicting reservoir releases from hydrometeorological drivers including reservoir storage, inflow, precipitation, and temperature. However, because of their black-box nature and lack of process-based implementation, we are unsure whether LSTM makes good predictions for the right reasons. In this work, we use an explainable machine learning (ML) method, called SHapley Additive exPlanations (SHAP), to evaluate the variable importance and variable-wise temporal importance in the LSTM model prediction. In application to 30 reservoirs over the Upper Colorado River Basin, United States, we show that LSTM can accurately predict the reservoir releases with NSE ≥ 0.69 for all the considered reservoirs despite their diverse storage sizes, functionality, elevations, etc. Additionally, SHAP indicates that storage and inflow are more influential than precipitation and temperature. Moreover, the storage and inflow show a relatively long-term influence on the release, up to 7 days, and this influence decreases as the lag time increases for most reservoirs. These findings from SHAP are consistent with our physical understanding. However, in a few reservoirs, SHAP gives some temporal importances that are difficult to interpret from a hydrological point of view, probably because it ignores variable interactions. SHAP is a useful tool for black-box ML model explanations, but the hydrological processes inferred from its results should be interpreted cautiously. Further investigation of SHAP and its applications in hydrological modeling is needed and will be pursued in our future study.
- Research Article
- 10.3390/cancers17162614
- Aug 9, 2025
- Cancers
Background/Objectives: Gliomas are complex and heterogeneous brain tumors characterized by an unfavorable clinical course and a fatal prognosis, which can be improved by early determination of the tumor type. Here, we developed explainable machine learning (ML) models for classifying three major glioma subtypes (astrocytoma, oligodendroglioma, and glioblastoma) and predicting survival rates based on RNA-seq data. Methods: We analyzed publicly available datasets and applied feature selection techniques to identify key biomarkers. Using various ML models, we performed classification and survival analysis to develop robust predictive models. The best-performing models were then interpreted using Shapley additive explanations (SHAP). Results: Thirteen key genes (TERT, NOX4, MMP9, TRIM67, ZDHHC18, HDAC1, TUBB6, ADM, NOG, CHEK2, KCNJ11, KCNIP2, and VEGFA) proved to be closely associated with glioma subtypes as well as survival. Support Vector Machine (SVM) turned out to be the optimal classification model, with a balanced accuracy of 0.816 and an area under the receiver operating characteristic curve (AUC) of 0.896 for the test datasets. The Case-Control Cox regression model (CoxCC) proved best for predicting survival, with a Harrell's C-index of 0.809 and 0.8 for the test datasets. Using SHAP, we revealed the influence of gene expression on the outputs of both models, thus enhancing the transparency of the prediction generation process. Conclusions: The results indicated that the developed models could serve as a valuable practical tool for clinicians, assisting them in diagnosing and determining optimal treatment strategies for patients with glioma.
- Research Article
- 10.1200/cci-24-00178
- Mar 1, 2025
- JCO Clinical Cancer Informatics
This study aims to investigate the impact of tumor quadrant location on the 5-year early-stage breast cancer survivability prediction using explainable machine learning (ML) models. By integrating these predictive models with Shapley Additive Explanations (SHAP), feature importance, and coefficient effect size, we aim to provide insights into the significant factors influencing patient outcomes. Data from 401 patients with early-stage breast cancer at the University of Missouri's Ellis Fischel Cancer Center were used, encompassing 20 variables related to demographics, tumor characteristics, and therapeutics. Six ML models, namely, Extreme Gradient Boosting, Random Forest classifier, Logistic Regression, Decision Tree classifier (DT), Support Vector Machine classifier, and AdaBoost (ADB), were trained and evaluated using various performance metrics, including accuracy, sensitivity, specificity, F1-score, area under the receiver operating characteristic curve (AUC-ROC), and area under the precision-recall curve (AUC-PR). Feature importance, coefficient effect size, and SHAP values were used to interpret and visualize the importance of different features, particularly focusing on tumor quadrant variables. The extreme gradient boosting model outperformed other models, achieving an AUC-ROC score of 0.98 and an AUC-PR score of 0.97. The analysis revealed that tumor quadrant variables, especially the upper outer and miscellaneous or overlapping sites, were among the top predictive features for breast cancer survivability. SHAP analysis further highlighted the significance of these tumor locations in influencing survival outcomes. This study demonstrates the efficacy of explainable ML models in predicting 5-year early-stage breast cancer survivability and identifies tumor quadrant location as an independent prognostic factor.
The use of SHAP values provides a clear interpretation of the model's predictions, offering valuable insights for clinicians to refine treatment protocols and improve patient outcomes.
- Research Article
- 10.21037/tau-2025-350
- Oct 25, 2025
- Translational Andrology and Urology
Background: While environmental heavy metal exposure has been linked to various metabolic disorders, its association with overactive bladder (OAB) remains poorly characterized. Emerging evidence suggests body mass index (BMI) may mediate heavy metal-induced metabolic dysregulation, though underlying pathways remain unclear. This study investigates the interplay between heavy metal exposure, BMI, and OAB risk via explainable machine learning (ML) and mediation analysis. Methods: Drawing on data from the National Health and Nutrition Examination Survey (NHANES) [2005–2010], we identified OAB-associated heavy metals via least absolute shrinkage and selection operator (LASSO) regression and the Boruta algorithm, then developed ten ML models. The optimal model, Extreme Gradient Boosting (XGBoost), was selected based on performance metrics and interpreted via Permutation Feature Importance (PFI), Shapley Additive Explanations (SHAP), and Partial Dependence Plots (PDP). Dose-response relationships, mixture effects, and BMI-mediated pathways were validated through logistic regression (LR), restricted cubic splines (RCS), Bayesian kernel machine regression (BKMR), and mediation analysis. Results: Among 3,201 eligible participants, blood lead, blood iron, urinary barium, urinary cadmium, urinary thallium, and urinary mercury were identified as OAB-associated metals. The XGBoost model achieved superior predictive performance [area under the curve (AUC): 0.736]. PFI highlighted hypertension, urinary cadmium, and age as key OAB determinants, while SHAP emphasized urinary cadmium and blood iron as primary predictors. PDP revealed a positive cadmium-OAB association and an inverse iron-OAB relationship. LR confirmed blood iron [odds ratio (OR) = 0.72, 95% confidence interval (CI): 0.57–0.90] and urinary cadmium (OR = 1.23, 95% CI: 1.06–1.42) as independent risk factors. RCS demonstrated linear trends for cadmium/iron and nonlinear trends for lead.
BKMR analysis confirmed a positive overall mixture effect (conditional posterior inclusion probabilities = 0.9860), with urinary cadmium showing the strongest exposure-response relationship. Mediation analysis indicated BMI mediated 14.80% of iron’s protective effect and partially counteracted cadmium/lead risks (mediation proportions: −17.33%). Conclusions: Urinary cadmium, blood lead, and iron emerge as critical OAB risk modulators, with BMI serving as a partial mediator. Integrating explainable ML with conventional epidemiology elucidates environmental-metabolic interactions in OAB pathogenesis, underscoring the need for heavy metal screening and BMI management in high-risk populations.
- Research Article
1
- 10.4174/astr.2023.105.4.237
- Jan 1, 2023
- Annals of Surgical Treatment and Research
Sepsis is one of the most common causes of death after surgery. Several conventional scoring systems have been developed to predict the outcome of sepsis; however, their predictive power is insufficient. The present study applies explainable machine-learning algorithms to improve the accuracy of predicting postoperative mortality in patients with sepsis caused by peritonitis. We performed a retrospective analysis of data from demographic, clinical, and laboratory analyses, including the delta neutrophil index (DNI), WBC and neutrophil counts, and CRP level. Laboratory data were measured before surgery, 12-36 hours after surgery, and 60-84 hours after surgery. The primary study output was the probability of mortality. The areas under the receiver operating characteristic curves (AUCs) of several machine-learning algorithms using the Sequential Organ Failure Assessment (SOFA) and Simplified Acute Physiology Score (SAPS) 3 models were compared. 'SHapley Additive exPlanations' values were used to indicate the direction of the relationship between a variable and mortality. The CatBoost model yielded the highest AUC (0.933) for mortality compared to SAPS3 and SOFA (0.860 and 0.867, respectively). Increased DNI on day 3, septic shock, use of norepinephrine therapy, and increased international normalized ratio on day 3 had the greatest impact on the model's prediction of mortality. Machine-learning algorithms increase the accuracy of predicting postoperative mortality in patients with sepsis caused by peritonitis.
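The models in this abstract are compared by AUC. One way to read an AUC is as the probability that a randomly chosen positive case (here, a death) receives a higher predicted risk than a randomly chosen negative case. The sketch below computes it that way by pairwise comparison; the labels and scores are invented for illustration and the function name `roc_auc` is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = died) and predicted mortality risks.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise form is quadratic in the number of cases, which is fine for a small illustration; rank-based implementations give the same value more efficiently on large cohorts.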
- Research Article
22
- 10.1016/j.cie.2023.109261
- Apr 23, 2023
- Computers & Industrial Engineering
Project control is a crucial phase within project management aimed at ensuring —in an integrated manner— that the project objectives are met according to plan. Earned Value Management —along with its various refinements— is the most popular and widespread method for top-down project control. For project control under uncertainty, Monte Carlo simulation and statistical/machine learning models extend the earned value framework by allowing the analysis of deviations, expected times and costs during project progress. Recent advances in explainable machine learning, in particular attribution methods based on Shapley values, can be used to link project control to activity properties, facilitating the interpretation of interrelations between activity characteristics and control objectives. This work proposes a new methodology that adds an explainability layer based on SHAP —Shapley Additive exPlanations— to different machine learning models fitted to Monte Carlo simulations of the project network during tracking control points. Specifically, our method allows for both prospective and retrospective analyses, which have different utilities: forward analysis helps to identify key relationships between the different tasks and the desired outcomes, thus being useful to make execution/replanning decisions; and backward analysis serves to identify the causes of project status during project progress. Furthermore, this method is general, model-agnostic and provides quantifiable and easily interpretable information, hence constituting a valuable tool for project control in uncertain environments.
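The attribution methods mentioned here rest on the Shapley value, which assigns each feature its average marginal contribution over all orders in which features can be added. The sketch below computes exact Shapley values for a tiny made-up value function; the activity names `A`, `B`, `C` and the delay figures are hypothetical, and this is the textbook definition rather than the SHAP library or the authors' methodology, which approximate it for fitted models.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def project_delay(activities):
    """Hypothetical delay (days) explained by a set of disrupted activities."""
    delay = 0.0
    if "A" in activities:
        delay += 3.0
    if "B" in activities:
        delay += 1.0
    if "A" in activities and "C" in activities:
        delay += 2.0  # interaction: C only matters when A is also disrupted
    return delay


phi = shapley_values(["A", "B", "C"], project_delay)
print(phi)
```

In this toy case the 2-day interaction between A and C is split equally between them, and the attributions sum to the delay of the full set (6 days), which is the additivity property that makes the values easy to interpret. Enumerating all orderings is only feasible for a handful of features.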
- Research Article
3
- 10.1016/j.ebiom.2024.105244
- Jul 17, 2024
- eBioMedicine
Predicting Clostridioides difficile infection outcomes with explainable machine learning
- Research Article
- 10.7150/jca.110141
- Mar 3, 2025
- Journal of Cancer
Background: The occurrence of papillary thyroid cancer (PTC) has risen substantially and tends to exhibit early-stage lymph node metastasis (LNM), increasing the risk of postoperative recurrence and decreasing survival. There is a lack of a machine learning (ML) model to predict delphian LNM (DLNM) in PTC. This investigation seeks to comprehensively assess the significance of standard clinical indicators for DLNM prediction, while constructing a dependable and widely applicable ensemble ML framework to support surgical planning and therapeutic decision-making. Methods: This investigation incorporated 1993 sequential PTC patients who underwent curative surgical procedures from 2020 to 2023. Based on the time to surgery, we divided the cohort into the training cohort (n=1395) and the validation cohort (n=598). The Boruta algorithm was applied to select feature variables, succeeded by the development of an innovative ML structure combining 12 ML techniques across 113 permutations to create a unified prediction model (DLNM index). ROC analysis, calibration curve, Bootstrapping, 10-fold cross validation, restricted cubic spline (RCS) regression, multivariable logistic regression, and subgroup analysis were utilised to evaluate the predictive accuracy and discriminative ability of the DLNM index. Model interpretation and feature impact visualisation were accomplished through the Shapley Additive Explanations (SHAP) methodology. Results: Based on 14 features via the Boruta algorithm selection, we integrated them into 12 ML approaches, yielding 113 permutations, from which we identified the superior algorithm to establish a consensus ML-derived diagnostic model (DLNM index). The DLNM index exhibited excellent diagnostic values with a mean AUC of 0.763 in two cohorts and discriminative ability, serving as an independent risk factor (P < 0.001). It performed better in predicting performance and yielded a larger net benefit than the published model (P < 0.05). 
Bootstrapping and 10-fold cross validation, and subgroup analysis showed that the DLNM index was generally robust and generalisable. SHAP explains the importance of ranking features (tumour size, right 4 region LN, FT4, TG, and T3) and visualises global and individual risk prediction. RCS regression suggested a nonlinear link between the DLNM index, TG, tumour size, FT3, and DLNM risk. Conclusion: An optimised explainable model (DLNM index) comprising 12 clinical features based on multiple ML algorithms was constructed and validated to provide an economical, readily available, and precise diagnostic instrument for DLNM in PTC, which has potential implications for clinical practice. The SHAP explanation and RCS regression quantify and visualise tumour size and FT4 as the most important variables that increase DLNM risk.
- Research Article
31
- 10.3389/fmed.2021.663739
- Apr 23, 2021
- Frontiers in Medicine
Objective: The number of patients requiring prolonged mechanical ventilation (PMV) is increasing worldwide, but the weaning outcome prediction model in these patients is still lacking. We hence aimed to develop an explainable machine learning (ML) model to predict successful weaning in patients requiring PMV using a real-world dataset. Methods: This retrospective study used the electronic medical records of patients admitted to a 12-bed respiratory care center in central Taiwan between 2013 and 2018. We used three ML models, namely, extreme gradient boosting (XGBoost), random forest (RF), and logistic regression (LR), to establish the prediction model. We further illustrated the feature importance categorized by clinical domains and provided visualized interpretation by using SHapley Additive exPlanations (SHAP) as well as local interpretable model-agnostic explanations (LIME). Results: The dataset contained data of 963 patients requiring PMV, and 56.0% (539/963) of them were successfully weaned from mechanical ventilation. The XGBoost model (area under the curve [AUC]: 0.908; 95% confidence interval [CI] 0.864–0.943) and RF model (AUC: 0.888; 95% CI 0.844–0.934) outperformed the LR model (AUC: 0.762; 95% CI 0.687–0.830) in predicting successful weaning in patients requiring PMV. To give the physician an intuitive understanding of the model, we stratified the feature importance by clinical domains. The cumulative feature importance in the ventilation domain, fluid domain, physiology domain, and laboratory data domain was 0.310, 0.201, 0.265, and 0.182, respectively. We further used the SHAP plot and partial dependence plot to illustrate associations between features and the weaning outcome at the feature level. Moreover, we used LIME plots to illustrate the prediction model at the individual level.
Additionally, we addressed the weekly performance of the three ML models and found that the accuracy of XGBoost/RF was ~0.7 between weeks 4 and 7 and slightly declined to 0.6 in weeks 8 and 9. Conclusion: We used an ML approach, mainly XGBoost, SHAP plot, and LIME plot, to establish an explainable weaning prediction ML model in patients requiring PMV. We believe these approaches should largely mitigate the concern of the black-box issue of artificial intelligence, and future studies are warranted for the implementation of the proposed model.
- Research Article
5
- 10.3389/fphar.2023.1176096
- May 23, 2023
- Frontiers in Pharmacology
Background: Acute kidney injury (AKI), with an increase in serum creatinine, is a common adverse drug event. Although various clinical studies have investigated whether a combination of two nephrotoxic drugs has an increased risk of AKI using traditional statistical models such as multivariable logistic regression (MLR), the evaluation metrics have not been evaluated despite the fact that traditional statistical models may over-fit the data. The aim of the present study was to detect drug-drug interactions with an increased risk of AKI by interpreting machine-learning models to avoid overfitting. Methods: We developed six machine-learning models trained using electronic medical records: MLR, logistic least absolute shrinkage and selection operator regression (LLR), random forest, extreme gradient boosting (XGB) tree, and two support vector machine models (kernel = linear function and radial basis function). In order to detect drug-drug interactions, the XGB and LLR models that showed good predictive performance were interpreted by SHapley Additive exPlanations (SHAP) and relative excess risk due to interaction (RERI), respectively. Results: Among approximately 2.5 million patients, 65,667 patients were extracted from the electronic medical records, and assigned to case (N = 5,319) and control (N = 60,348) groups. In the XGB model, a combination of loop diuretic and histamine H2 blocker [mean (|SHAP|) = 0.011] was identified as a relatively important risk factor for AKI.
The combination of loop diuretic and H2 blocker showed a significant synergistic interaction on an additive scale (RERI 1.289, 95% confidence interval 0.226–5.591) also in the LLR model. Conclusion: The present population-based case-control study using interpretable machine-learning models suggested that although the relative importance of the individual and combined effects of loop diuretics and H2 blockers is lower than that of well-known risk factors such as older age and sex, concomitant use of a loop diuretic and histamine H2 blocker is associated with increased risk of AKI.
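The relative excess risk due to interaction (RERI) used above measures departure from additivity of two exposures. In its usual form it is the relative risk with both exposures, minus the relative risk for each exposure alone, plus one; in case-control data, odds ratios are commonly substituted for relative risks. The sketch below uses invented ratios, not the study's estimates, and the function name `reri` is an assumption.

```python
def reri(rr_both, rr_a_only, rr_b_only):
    """RERI = RR(A and B) - RR(A only) - RR(B only) + 1, relative to no exposure."""
    return rr_both - rr_a_only - rr_b_only + 1.0


# Hypothetical ratios versus patients receiving neither drug.
rr_both = 3.0     # both drug A and drug B
rr_a_only = 1.5   # drug A alone
rr_b_only = 1.2   # drug B alone

interaction = reri(rr_both, rr_a_only, rr_b_only)
print(round(interaction, 3))
```

A value above 0 indicates that the joint effect exceeds the sum of the separate effects (a synergistic interaction on the additive scale), 0 indicates exact additivity, and a value below 0 indicates antagonism.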