Evaluation of a Gene Expression-Based Machine Learning Classifier to Discriminate Normal from Cancer Gastric Organoids

Abstract

Three-dimensional cell model systems such as tumour organoids allow for in vitro modelling of self-organized tissue with functional and histologic similarity to in vivo tissue. However, there is a need for standard protocols and techniques to confirm the presence of cancer within organoids derived from tumour tissue. The aim of this study was to assess the utility of a Nanostring gene expression-based machine learning classifier to determine the presence of cancer or normal organoids in cultures developed from both benign and cancerous stomach biopsies. A prospective cohort of normal and cancer stomach biopsies was collected from 2019 to 2022. Tissue specimens were processed for formalin-fixed paraffin-embedding (FFPE) and a subset of specimens was established in organoid cultures. Specimens were labelled as normal or cancer according to analysis of the FFPE tissue by two pathologists. The gene expression in FFPE and organoid tissue was measured using a 107-gene Nanostring codeset and normalized using the Removal of Unwanted Variation III algorithm. Our machine learning model was developed using five-fold nested cross-validation to classify normal or cancer gastric tissue from publicly available Asian Cancer Research Group (ACRG) gene expression data. The models were externally validated using the Cancer Genome Atlas (TCGA), as well as our own FFPE and organoid gene expression data. A total of 60 samples were collected, including 38 cancer FFPE specimens, 5 normal FFPE specimens, 12 cancer organoids, and 5 normal organoids. The optimal model design used a Least Absolute Shrinkage and Selection Operator model for feature selection and an ElasticNet model for classification, yielding area under the curve (AUC) values of 0.99 [95% CI: 0.99–1], 0.90 [95% CI: 0.87–0.93], and 0.79 [95% CI: 0.74–0.84] for ACRG (internal test), FFPE, and organoid (external test) data, respectively. The performance of our final model on external data achieved AUC values of 0.99 [95% CI: 0.98–1], 0.94 [95% CI: 0.86–1], and 0.85 [95% CI: 0.63–1] for TCGA, FFPE, and organoid specimens, respectively. Using a public database to create a machine learning model in combination with a Nanostring gene expression assay allows us to classify organoids and their paired whole tissue samples as normal or cancer. This platform yielded reasonable accuracy for FFPE and organoid specimens, with the former being more accurate. This study re-affirms that although organoids are a high-fidelity model, there are still limitations in validating the recapitulation of cancer in vitro.
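The abstract reports each result as an AUC with a 95% confidence interval. As a rough illustration of what such a figure means, the sketch below computes AUC as the probability that a cancer sample scores higher than a normal sample, and estimates an interval with a percentile bootstrap. The abstract does not say how its intervals were obtained, so the bootstrap is an assumption here, and the function names (`roc_auc`, `bootstrap_auc_ci`) and the classifier scores are hypothetical. Only the class sizes (12 cancer and 5 normal organoids) come from the text.

```python
import random


def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC (an assumed method, not the paper's)."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # this resample lost one class, so AUC is undefined
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical classifier scores for 12 cancer (1) and 5 normal (0) organoids.
labels = [1] * 12 + [0] * 5
scores = [0.91, 0.85, 0.78, 0.74, 0.69, 0.66, 0.62, 0.58, 0.55, 0.47, 0.41, 0.33,
          0.52, 0.44, 0.30, 0.25, 0.18]

auc = roc_auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {auc:.2f} [95% CI: {low:.2f}-{high:.2f}]")
```

With only 17 samples, each resample can change the ranking considerably, which is consistent with the wide interval the abstract reports for organoids.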

Similar Papers
  • Research Article
  • Cited by 114
  • 10.1016/j.isci.2020.101411
Human Lung Adenocarcinoma-Derived Organoid Models for Drug Screening
  • Jul 25, 2020
  • iScience
  • Zhichao Li + 14 more

Summary: Lung cancer is an extremely heterogeneous disease, and its treatment remains one of the most challenging tasks in medicine. Few existing laboratory lung cancer models can faithfully recapitulate the diversity of the disease and predict therapy response. Here, we establish 12 patient-derived organoids from the most common lung cancer subtype, lung adenocarcinoma (LADC). Extensive gene and histopathology profiling show that the tumor organoids retain the histological architectures, genomic landscapes, and gene expression profiles of their parental tumors. Patient-derived lung cancer organoids are amenable for biomarker identification and high-throughput drug screening in vitro. This study should enable the generation of patient-derived lung cancer organoid lines, which can be used to further the understanding of lung cancer pathophysiology and to assess drug response in personalized medicine.

  • Research Article
  • Cited by 34
  • 10.1007/s00330-020-07083-2
Improved long-term prognostic value of coronary CT angiography-derived plaque measures and clinical parameters on adverse cardiac outcome using machine learning
  • Jul 28, 2020
  • European Radiology
  • Christian Tesche + 13 more

To evaluate the long-term prognostic value of coronary CT angiography (cCTA)-derived plaque measures and clinical parameters on major adverse cardiac events (MACE) using machine learning (ML). Datasets of 361 patients (61.9 ± 10.3 years, 65% male) with suspected coronary artery disease (CAD) who underwent cCTA were retrospectively analyzed. MACE was recorded. cCTA-derived adverse plaque features and conventional CT risk scores together with cardiovascular risk factors were provided to a ML model to predict MACE. A boosted ensemble algorithm (RUSBoost) utilizing decision trees as weak learners with repeated nested cross-validation to train and validate the model was used. Performance of the ML model was calculated using the area under the curve (AUC). MACE was observed in 31 patients (8.6%) after a median follow-up of 5.4 years. Discriminatory power was significantly higher for the ML model (AUC 0.96 [95% CI 0.93-0.98]) compared with conventional CT risk scores including Agatston calcium score (AUC 0.84 [95% CI 0.80-0.87]), segment involvement score (AUC 0.88 [95% CI 0.84-0.91]), and segment stenosis score (AUC 0.89 [95% CI 0.86-0.92], all p < 0.05). Similar results were shown for adverse plaque measures (AUCs 0.72-0.82, all p < 0.05) and clinical parameters including the Framingham risk score (AUCs 0.71-0.76, all p < 0.05). The ML model yielded significantly higher diagnostic performance compared with logistic regression analysis (AUC 0.96 vs. 0.92, p = 0.024). Integration of a ML model improves the long-term prediction of MACE when compared with conventional CT risk scores, adverse plaque measures, and clinical information. ML algorithms may improve the integration of patient's information to enhance risk stratification.
  • A machine learning (ML) model portends high discriminatory power to predict major adverse cardiac events (MACE).
  • ML-based risk stratification shows superior diagnostic performance for MACE prediction over coronary CT angiography (cCTA)-derived risk scores or clinical parameters alone.
  • A ML model outperforms conventional logistic regression analysis for the prediction of MACE.

  • Research Article
  • Cited by 14
  • 10.1007/s00261-021-03051-6
Predicting the stages of liver fibrosis with multiphase CT radiomics based on volumetric features.
  • Mar 22, 2021
  • Abdominal Radiology
  • Enming Cui + 6 more

To develop and externally validate a multiphase computed tomography (CT)-based machine learning (ML) model for staging liver fibrosis (LF) by using whole liver slices. The development dataset comprised 232 patients with pathological analysis for LF, and the test dataset comprised 100 patients from an independent outside institution. Feature extraction was performed based on the precontrast (PCP), arterial (AP), portal vein (PVP) phase, and three-phase CT images. CatBoost was utilized for ML model investigation by using the features with good reproducibility. The diagnostic performance of ML models based on each single- and three-phase CT image was compared with that of radiologists' interpretations, the aminotransferase-to-platelet ratio index, and the fibrosis index based on four factors (FIB-4) by using the receiver operating characteristic curve with the area under the curve (AUC) value. Although the ML model based on three-phase CT images (AUC = 0.65-0.80) achieved a higher AUC value than those based on PCP (AUC = 0.56-0.69) and PVP (AUC = 0.51-0.74) in predicting the various stages of LF, no significant difference was found. The best CT-based ML model (AUC = 0.65-0.80) outperformed the FIB-4 in differentiating advanced LF and cirrhosis and radiologists' interpretation (AUC = 0.50-0.76) in the diagnosis of significant and advanced LF. All PCP, PVP, and three-phase CT-based ML models can be acceptable for assessing LF, and the performance of the PCP-based ML model is comparable to that of the enhanced CT image-based ML model.

  • Research Article
  • 10.1182/blood-2024-211964
Systematic Review of Machine Learning Models for Myelodysplastic Syndrome Diagnosis
  • Nov 5, 2024
  • Blood
  • Karna Desai + 5 more


  • Research Article
  • Cited by 1
  • 10.1186/s12874-025-02694-z
Comparison of machine learning methods versus traditional Cox regression for survival prediction in cancer using real-world data: a systematic literature review and meta-analysis
  • Oct 28, 2025
  • BMC Medical Research Methodology
  • Yinan Huang + 6 more

Background: Accurate prediction of survival in oncology can guide targeted interventions. The traditional regression-based Cox proportional hazards (CPH) model has statistical assumptions and may have limited predictive accuracy. With the capability to model large datasets, machine learning (ML) holds the potential to improve the prediction of time-to-event outcomes, such as cancer survival outcomes. The present study aimed to systematically summarize the use of ML models for cancer survival outcomes in observational studies and to compare the performance of ML models with CPH models. Methods: We systematically searched PubMed, MEDLINE (via EBSCO), and Embase for studies that evaluated ML models vs. CPH models for cancer survival outcomes. The use of ML algorithms was summarized, and either the area under the curve (AUC) or the concordance index (C-index) for the ML and CPH models was presented descriptively. Only studies that provided a measure of discrimination, i.e., AUC or C-index, and 95% confidence interval (CI) were included in the final meta-analysis. A random-effects model was used to compare the predictive performance in the pooled AUC or C-index estimates between ML and CPH models using R. The quality of the studies was evaluated using available checklists. Multiple sensitivity analyses were performed. Results: A total of 21 studies were included for systematic review and 7 for meta-analysis. Across the 21 articles, diverse ML models were used, including random survival forest (N=16, 76.19%), gradient boosting (N=5, 23.81%), and deep learning (N=8, 38.09%). In predicting cancer survival outcomes, ML models showed no superior performance over CPH regression. The standardized mean difference in AUC or C-index was 0.01 (95% CI: -0.01 to 0.03). Results from the sensitivity analyses confirmed the robustness of the main findings. Conclusions: ML models had similar performance compared with CPH models in predicting cancer survival outcomes. Although this systematic review highlights the promising use of ML to improve the quality of care in oncology, findings from this review also suggest opportunities to improve ML reporting transparency. Future systematic reviews should focus on the comparative performance between specific ML models and CPH regression in time-to-event outcomes in specific types of cancer or other disease areas.

  • Research Article
  • Cited by 6
  • 10.1016/j.bbadis.2025.167693
Single-cell RNA sequencing and machine learning provide candidate drugs against drug-tolerant persister cells in colorectal cancer.
  • Mar 1, 2025
  • Biochimica et biophysica acta. Molecular basis of disease
  • Yosui Nojima + 2 more


  • Research Article
  • 10.29271/jcpsp.2025.08.1007
Predicting Extracorporeal Shock Wave Lithotripsy Outcomes Using Machine Learning and the Triple-/Quadruple-D Scores.
  • Aug 1, 2025
  • Journal of the College of Physicians and Surgeons--Pakistan : JCPSP
  • Mucahit Gelmis + 5 more

To evaluate the predictive performance of the triple-D and quadruple-D scores integrated with machine learning (ML) models in determining stone-free outcomes after extracorporeal shock wave lithotripsy (ESWL), and to compare ML model performance and identify the key predictors influencing ESWL success. Study Design: An observational study. Place and Duration of the Study: Department of Urology, Gaziosmanpasa Training and Research Hospital, Istanbul, Turkiye, from October 2020 to November 2024. A total of 309 patients who underwent ESWL were analysed. The patients were categorised into stone-free and non-stone-free groups based on post-treatment imaging. Clinical parameters, including quadruple-D score (stone volume, density, skin-to-stone distance [SSD], and location), were recorded. Three ML models, namely random forest (RF), logistic regression (LR), and neural network (NN), were trained on 80% of the dataset and tested on 20%. Model performance was assessed using accuracy, area under the curve (AUC), precision, recall, and F1 score. The quadruple-D score (AUC: 0.724) demonstrated superior predictive power compared to the triple-D score (AUC: 0.700). Among ML models, RF achieved the highest accuracy (82.9%, AUC: 0.91), followed by NN (80.9%, AUC: 0.87) and LR (79.6%, AUC: 0.85). Significant predictors of ESWL success were stone density, volume, SSD, and the quadruple-D score, while age and body mass index (BMI) were not significant. Integrating the quadruple-D score with ML models, particularly RF, enhances the prediction of ESWL outcomes. Combining clinical expertise with computational intelligence can refine patient selection and optimise treatment strategies. However, prospective studies are needed to validate these findings. Keywords: Extracorporeal shock wave lithotripsy, Quadruple-D score, Machine learning, Random forest, Stone-free prediction.

  • Research Article
  • Cited by 2
  • 10.1097/md.0000000000038513
Performance evaluation of ML models for preoperative prediction of HER2-low BC based on CE-CBBCT radiomic features: A prospective study
  • Jun 14, 2024
  • Medicine
  • Xianfei Chen + 3 more

To explore the value of machine learning (ML) models based on contrast-enhanced cone-beam breast computed tomography (CE-CBBCT) radiomics features for the preoperative prediction of human epidermal growth factor receptor 2 (HER2)-low expression breast cancer (BC). Fifty-six patients with HER2-negative invasive BC who underwent preoperative CE-CBBCT were prospectively analyzed. Patients were randomly divided into training and validation cohorts at approximately 7:3. A total of 1046 quantitative radiomic features were extracted from CE-CBBCT images and normalized using z-scores. The Pearson correlation coefficient and recursive feature elimination were used to identify the optimal features. Six ML models were constructed based on the selected features: linear discriminant analysis (LDA), random forest (RF), support vector machine (SVM), logistic regression (LR), AdaBoost (AB), and decision tree (DT). To evaluate the performance of these models, receiver operating characteristic curves and area under the curve (AUC) were used. Seven features were selected as the optimal features for constructing the ML models. In the training cohort, the AUC values for SVM, LDA, RF, LR, AB, and DT were 0.984, 0.981, 1.000, 0.970, 1.000, and 1.000, respectively. In the validation cohort, the AUC values for the SVM, LDA, RF, LR, AB, and DT were 0.859, 0.880, 0.781, 0.880, 0.750, and 0.713, respectively. Among all ML models, the LDA and LR models demonstrated the best performance. The DeLong test showed that there were no significant differences among the receiver operating characteristic curves of the ML models in the training cohort (P > .05); however, in the validation cohort, the DeLong test showed that the differences between the AUC of LDA and those of RF, AB, and DT were statistically significant (P = .037, .003, .046). The differences between the AUC of LR and those of RF, AB, and DT were also statistically significant (P = .023, .005, .030). Nevertheless, no statistically significant differences were observed when compared to the other ML models. ML models based on CE-CBBCT radiomics features achieved excellent performance in the preoperative prediction of HER2-low BC and could potentially serve as an effective tool to assist in precise and personalized targeted therapy.

  • Research Article
  • Cited by 35
  • 10.1155/2021/9980410
Reproduction of the Cancer Genome Atlas (TCGA) and Asian Cancer Research Group (ACRG) Gastric Cancer Molecular Classifications and Their Association with Clinicopathological Characteristics and Overall Survival in Moroccan Patients
  • Jul 28, 2021
  • Disease Markers
  • Jean Paul Nshizirungu + 15 more

Introduction: The Cancer Genome Atlas (TCGA) project and Asian Cancer Research Group (ACRG) recently categorized gastric cancer into molecular subtypes. Nevertheless, these classification systems require high cost and sophisticated molecular technologies, preventing their widespread use in the clinic. This study aimed to generate molecular subtypes of gastric cancer using techniques available in routine diagnostic practice in a series of Moroccan gastric cancer patients. In addition, we assessed the associations between molecular subtypes, clinicopathological features, and prognosis. Methods: Ninety-seven gastric cancer cases were classified according to TCGA, ACRG, and integrated classifications using a panel of four molecular markers (EBV, MSI, E-cadherin, and p53). HER2 status and PD-L1 expression were also evaluated. These markers were analyzed using immunohistochemistry (E-cadherin, p53, HER2, and PD-L1), in situ hybridization (EBV and HER2 equivocal cases), and multiplex PCR (MSI). Results: Our results showed that the subtypes presented distinct clinicopathological features and prognosis. EBV-positive gastric cancers were found exclusively in male patients. The GS (TCGA classification), MSS/EMT (ACRG classification), and E-cadherin aberrant subtype (integrated classification) presented the Lauren diffuse histology enrichment and tended to be diagnosed at a younger age. The MSI subtype was associated with a better overall survival across all classifications (TCGA, ACRG, and integrated classification). The worst prognosis was observed in the EBV subtype (TCGA and integrated classification) and MSS/EMT subtype (ACRG classification). Discussion/Conclusion: We reported reproducible and affordable gastric cancer subtyping algorithms that can reproduce the recently recognized TCGA, ACRG, and integrated gastric cancer classifications, using techniques available in routine diagnosis. These simplified classifications can be employed not only for molecular classification but also in predicting the prognosis of gastric cancer patients.

  • Research Article
  • 10.1007/s10620-025-09646-z
Value of Endoscopic Ultrasonography for Distinguishing Malignant from Benign Non-pancreatic Periampullary Lesions: An Explainable Machine Learning Study.
  • Jan 9, 2026
  • Digestive diseases and sciences
  • Xue-Yong Zuo + 2 more

Early discrimination of non-pancreatic periampullary lesions (NPLs) is challenging owing to their complex anatomy and the absence of representative clinical symptoms. To establish an interpretable machine learning (ML) model that integrates clinical variables and endoscopic ultrasonography (EUS) features to diagnose NPLs. A total of 158 patients, suspected of having NPLs and who underwent EUS, were enrolled and randomly allocated into a training cohort (TC, n = 110) and a validation cohort (VC, n = 48). Risk clinical and EUS features were identified by multivariate logistic regression analysis and subsequently input into five ML classifiers to develop predictive models. The performance of ML models was assessed using the area under the curve (AUC), calibration curve, and decision curve analysis (DCA). The Shapley Additive Explanations (SHAP) approach was employed to interpret the result of the optimal ML model. Among the five ML models developed, the ExtraTrees model achieved the highest AUC values of 0.94 (95% confidence interval (CI): 0.89-0.99) and 0.94 (95% CI: 0.82-1.00) in TC and VC, respectively. This performance was followed by the extreme gradient boosting model (AUC = 0.94/0.93), the light gradient boosting machine (AUC = 0.92/0.91), the support vector machine (AUC = 0.91/0.94), and the logistic regression model (AUC = 0.86/0.87). The calibration curve and DCA graphically suggested good agreement and superior clinical benefits for the ExtraTrees model. SHAP analysis identified abdominal discomfort, lesion diameter, irregular shape, surface ulceration, and nonsmooth margin as the most influential features in the model's decision-making process. Our developed ML model exhibited superior capability and higher clinical benefit in distinguishing malignant from benign NPLs, particularly the ExtraTrees model. Furthermore, the SHAP analysis provided insightful interpretation of the ExtraTrees model for individualized and transparent prediction of NPLs.

  • Research Article
  • 10.1101/2024.10.17.24315710
Detecting Glaucoma Worsening Using Optical Coherence Tomography Derived Visual Field Estimates.
  • Oct 18, 2024
  • medRxiv : the preprint server for health sciences
  • Alex T Pham + 6 more

Multiple studies have attempted to generate visual field (VF) mean deviation (MD) estimates using cross-sectional optical coherence tomography (OCT) data. However, whether such models offer any value in detecting longitudinal VF progression is unclear. We address this by developing a machine learning (ML) model to convert OCT data to MD and assessing its ability to detect longitudinal worsening. Retrospective, longitudinal study. A model dataset of 70,575 paired OCT/VFs to train an ML model converting OCT to VF-MD. A separate progression dataset of 4,044 eyes with ≥ 5 paired OCT/VFs to assess the ability of OCT-derived MD to detect worsening. Progression dataset eyes had two additional unpaired VFs (≥ 7 total) to establish a "ground truth" rate of progression defined by MD slope. We trained an ML model using paired VF/OCT data to estimate MD measurements for each OCT scan (OCT-MD). We used this ML model to generate longitudinal OCT-MD estimates for progression dataset eyes. We calculated MD slopes after substituting/supplementing VF-MD with OCT-MD and measured the ability to detect progression. We labeled true progressors using a ground truth MD slope <0.5 dB/year calculated from ≥ 7 VF-MD measurements. We compared the area under the curve (AUC) of MD slopes calculated using both VF-MD (with <7 measurements) and OCT-MD. Because we found OCT-MD substitution had a statistically inferior AUC to VF-MD, we simulated the effect of reducing OCT-MD mean absolute error (MAE) on the ability to detect worsening. AUC. OCT-MD estimates had an MAE of 1.62 dB. AUC of MD slopes with partial OCT-MD substitution was significantly worse than the VF-MD slope. Supplementing VF-MD with OCT-MD also did not improve AUC, regardless of MAE. OCT-MD estimates needed an MAE ≤ 1.00 dB before AUC was statistically similar to VF-MD alone. 
ML models converting OCT data to VF-MD with error levels lower than published in prior work (MAE: 1.62 dB) were inferior to VF-MD data for detecting trend-based VF progression. Models converting OCT data to VF-MD must achieve better prediction errors (MAE ≤ 1 dB) to be clinically valuable at detecting VF worsening.

  • Research Article
  • Cited by 1
  • 10.1038/s41598-025-85695-8
The application of machine learning approaches to classify and predict fertility rate in Ethiopia
  • Jan 20, 2025
  • Scientific Reports
  • Ewunate Assaye Kassaw + 3 more

Integrating machine learning (ML) models into healthcare systems is a rapidly evolving field with the potential to revolutionize care delivery. This study aimed to classify fertility rates and identify significant predictors using ML models among reproductive women in Ethiopia. This study applied eight ML models to data on 5864 reproductive-age women from the 2019 Ethiopian Demographic Health Survey (EDHS). The Python programming language was used to develop these models. Predictors of fertility rate were determined using feature importance techniques. The performance of models was evaluated using accuracy, area under the curve (AUC), precision, recall, F1-score, specificity, and sensitivity. The mean age of participants was 32.7 (± 5.6) years. The random forest classifier (accuracy = 0.901 and AUC = 0.961), followed by a one-dimensional convolutional neural network (accuracy = 0.899 and AUC = 0.958), logistic regression (accuracy = 0.874 and AUC = 0.937), and gradient boost classifier (accuracy = 0.851 and AUC = 0.927), were the top performing ML models. Family size, age, occupation, and education, with average importance scores of 0.198, 0.151, 0.118, and 0.081, respectively, were the top significant predictors of the fertility rate. The best ML models to classify and predict fertility rates were random forest, one-dimensional convolutional neural network, logistic regression, and gradient boost classifier. The findings on important factors of fertility rate can inform targeted public health programs that address disparities related to family size, occupation, education, and other socioeconomic factors.

  • Research Article
  • Cited by 1
  • 10.1186/s12933-025-02911-5
An ensemble machine learning-based risk stratification tool for 30-day mortality prediction in critically ill cardiovascular patients.
  • Sep 30, 2025
  • Cardiovascular diabetology
  • Mingxing Lei + 11 more

Early mortality prediction in critically ill patients with cardiovascular disease remains challenging. This study aimed to develop and validate an ensemble machine learning (ML) model to predict 30-day mortality, comparing its performance with conventional severity scores and interrogating the incremental prognostic value of stress hyperglycemia ratio (SHR). A retrospective cohort of 1,595 ICU patients with cardiovascular disease combined with diabetes (2008-2022) was analyzed. SHR was calculated as admission glucose divided by estimated average glucose (eAG) from HbA1c. Six ML models (eXtreme Gradient Boosting [XGBoost], Decision Tree [DT], Random Forest [RF], Artificial Neural Network [ANN], Logistic Regression [LR], and Support Vector Machine [SVM]) were trained on 80% of the data, with the top three performers combined into an ensemble model. Model performance was evaluated using area under the curve (AUC), precision-recall, calibration, and clinical utility metrics. The 30-day mortality rate was 10.8% in the entire cohort (n = 173). The ensemble model demonstrated superior predictive performance with an AUC of 0.912 (95% CI: 0.888-0.936), outperforming both individual ML models (XGBoost, AUC = 0.903) and traditional scoring systems (APS III/SOFA/SAPS II AUCs ≤ 0.742; all P < 0.001). The top six important predictors included anti-hypertensives, aspirin, blood urea nitrogen (BUN), white blood cell (WBC), age, and red blood cell (RBC), with the Shapley Additive Explanations analysis revealing clinically meaningful patterns: a nonlinear risk escalation for age, linear risk increases with rising BUN and bilirubin levels, a protective effect associated with higher RBC counts, and both low and high WBC levels linked to increased early death risk. 
While SHR significantly improved the performance of traditional scoring systems (e.g., increasing SOFA AUC from 0.741 to 0.757, P = 0.010), its addition to the ensemble model provided limited incremental benefit (ΔAUC = - 0.032, P = 0.094). External validation in an independent cohort (n = 307) confirmed the model's robustness (AUC = 0.891, 95% CI: 0.864-0.917), with decision curve analysis demonstrating superior clinical utility across a wide range of risk thresholds. The ensemble ML model outperformed conventional prognostic tools in predicting 30-day mortality, with SHR augmenting traditional tools but not the ensemble ML model. This approach offers a reliable, interpretable framework for risk stratification in high-risk cardiovascular patients.
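The abstract above defines SHR as admission glucose divided by the estimated average glucose (eAG) derived from HbA1c. A minimal sketch of that arithmetic follows. The abstract does not give the eAG conversion, so this assumes the widely used ADAG regression (eAG in mg/dL = 28.7 × HbA1c% − 46.7) and glucose measured in mg/dL. The function names and the example values are hypothetical.

```python
def estimated_average_glucose_mgdl(hba1c_percent):
    """eAG in mg/dL from HbA1c (%), using the ADAG regression (an assumption here)."""
    return 28.7 * hba1c_percent - 46.7


def stress_hyperglycemia_ratio(admission_glucose_mgdl, hba1c_percent):
    """SHR = admission glucose / eAG, both expressed in mg/dL."""
    eag = estimated_average_glucose_mgdl(hba1c_percent)
    if eag <= 0:
        raise ValueError("HbA1c too low for the eAG regression to be meaningful")
    return admission_glucose_mgdl / eag


# Hypothetical patient: admission glucose 231.3 mg/dL, HbA1c 7.0%.
shr = stress_hyperglycemia_ratio(231.3, 7.0)
print(f"SHR = {shr:.2f}")
```

A ratio above 1 means the admission glucose exceeds the patient's estimated chronic average, which is the sense in which SHR is meant to capture acute stress hyperglycemia and not baseline glycemic control.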

  • Research Article
  • 10.1177/08850666251390848
An Interpretable Machine Learning Model for Early Multitemporal Prediction of Onset of Acute Kidney Injury in Intensive Care Unit Patients with Severe Trauma.
  • Oct 29, 2025
  • Journal of intensive care medicine
  • Bingrui Gao + 3 more

Acute Kidney Injury (AKI), a leading cause of organ failure in critically ill patients, demands early identification of high-risk patients to improve outcomes. Yet comparative analyses of diagnostic and prognostic machine learning (ML) models across multiple post-admission timeframes are lacking. Using MIMIC-IV, we applied the Boruta algorithm for feature selection and developed and compared six ML models to predict AKI risk at 0-24, 24-48, 48-72, 0-48, and 0-72 h post-ICU admission. Model performance was evaluated using the Area Under the Curve (AUC) and confusion matrix. Decision Curve and calibration analyses assessed clinical applicability. We compared models with Sequential Organ Failure Assessment (SOFA) and SAPS II scores to evaluate the accuracy of the ML models. Finally, Shapley Additive Explanations (SHAP) values interpreted and visualized key features of the optimal model. Our study involved 2092 trauma Intensive Care Unit (ICU) patients. Using 17 features selected from 48 candidates in trauma patients 24 h after ICU admission, all six ML models outperformed SOFA and SAPS II, and extreme gradient boosting (XGBoost) exhibited the best performance, achieving an AUC of 0.948 (95% CI [0.929-0.966]) for AKI prediction within 24 h of admission, with an AUC of 0.941 ([0.892-0.917]) and 0.878 ([0.863-0.892]) at the 0-48 and 0-72 h periods, respectively. However, predictive accuracy was very limited at 24-48 h (AUC 0.602 [0.562-0.643]) and 48-72 h (AUC 0.490 [0.429-0.551]). Urine output per kilogram per hour at 6 and 12 h and age were the most important features identified through SHAP analysis. Our study found ML models excel in diagnosing AKI risk in ICU trauma patients but have limited prognostic accuracy at 24-48 and 48-72 h post-admission. Further research is needed to improve this using time-series ML models with optimal windows.

  • Research Article
  • 10.1111/echo.70377
Machine Learning Models Integrating Two-Dimensional Speckle Tracking Echocardiography and Clinical Variables for Diagnosis of Severe Coronary Artery Disease.
  • Jan 1, 2026
  • Echocardiography (Mount Kisco, N.Y.)
  • Yuting Hu + 8 more

To develop and validate machine learning (ML) models integrating two-dimensional speckle tracking echocardiography (2D-STE) parameters with clinical variables for robust identification of severe coronary artery disease (sCAD). In this retrospective cohort study, five distinct ML models (Random Forest [RF], Support Vector Machine [SVM], K-Nearest Neighbors [KNN], Multi-Layer Perceptron [MLP], and Extremely Randomized Trees [Extra Trees]) were constructed to identify sCAD on a cohort of 204 patients (80% training set, 20% independent test set). Within the independent test set, two junior sonographers' diagnostic performance for sCAD was compared first without and then with ML assistance over a 2-week interval. SHapley Additive exPlanations (SHAP) analysis was applied to visualize and interpret the models, identifying key features driving sCAD prediction accuracy, with results visualized through dependence diagrams and force plots. Furthermore, a clinical nomogram integrating key predictors identified by ML models was developed to enable individualized quantification of sCAD risk. Utilizing five features, the MLP demonstrated the best performance with an area under the curve (AUC) of 0.870 and a sensitivity of 0.944. The SHAP visualization analysis for this model indicated that "LV AP4 Endo Peak L. Time SD" significantly influenced its predictions. The MLP model (AUC = 0.870) outperformed both junior sonographers (AUC = 0.687) and a nomogram constructed from ML-selected features (AUC = 0.712). Additionally, the results revealed that junior sonographers achieved significantly improved performance when assisted by the ML models. The developed ML models could differentiate patients with angiography-confirmed sCAD from those without. Importantly, these models significantly improved the diagnostic performance of junior sonographers when used as an assistive tool.
