Sustainable E-Health: Energy-Efficient Tiny AI for Epileptic Seizure Detection via EEG

Abstract

Tiny Artificial Intelligence (Tiny AI) is transforming resource-constrained embedded systems, particularly in e-health applications, through Tiny Machine Learning (TinyML) and its integration with the Internet of Things (IoT). Unlike conventional machine learning (ML), which demands substantial processing power and typically delegates computation to cloud infrastructure, TinyML allows lightweight models to run directly on embedded devices. This study aimed to (i) develop a TinyML workflow that details the steps for model creation and deployment in resource-constrained environments and (ii) apply the workflow to e-health applications for the real-time detection of epileptic seizures using electroencephalography (EEG) data. The methodology employs a dataset of EEG recordings from 500 patients, each 23.5 seconds long and comprising 4097 data points, to develop a robust and resilient model. The model was deployed using TinyML on microcontrollers tailored to hardware with limited resources; TensorFlow Lite (TFLite) efficiently runs ML models on small devices such as wearables. Simulation outcomes demonstrated strong performance, particularly in predicting epileptic seizures, with the ExtraTrees Classifier achieving a notable 99.6% Area Under the Curve (AUC) on the validation set. Because of its superior performance, the ExtraTrees Classifier was selected as the preferred model. For the optimized TinyML model, accuracy remained practically unchanged, whereas inference time was significantly reduced. Additionally, the converted model was approximately ten times smaller, at 256 KB, making it suitable for microcontrollers with no more than 1 MB of memory. These findings highlight the potential of TinyML to significantly enhance healthcare applications by enabling real-time, energy-efficient decision-making directly on local devices.
This is especially valuable in scenarios with limited computing resources or during emergencies, as it reduces latency, ensures privacy, and operates without reliance on cloud infrastructure. Moreover, by reducing the size of training datasets needed, TinyML helps lower overall costs and minimizes the risk of overfitting, making it an even more cost-effective and reliable solution for healthcare innovations.
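The dataset described above fixes each recording at 4097 samples over 23.5 seconds (roughly 174 Hz). A minimal, dependency-free sketch of segmenting such a recording into windows for on-device inference is shown below; the window and stride values are illustrative assumptions, not parameters from the study.

```python
# Segment a 23.5 s EEG recording (4097 samples, ~174 Hz) into fixed-length
# windows, a typical first step before feeding data to a TinyML classifier.
# Window size and stride below are illustrative assumptions.

def segment(signal, window, stride):
    """Return a list of (possibly overlapping) windows taken from `signal`."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, stride)]

SAMPLES = 4097                   # samples per recording (from the abstract)
DURATION_S = 23.5                # recording length in seconds
rate_hz = SAMPLES / DURATION_S   # ~174.3 Hz effective sampling rate

recording = [0.0] * SAMPLES                          # placeholder EEG trace
windows = segment(recording, window=256, stride=128) # 50% overlap
```

On a microcontroller, each window would be quantized and passed to the deployed TFLite model one at a time, keeping peak memory well under the 1 MB budget mentioned in the abstract.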

Similar Papers
  • Research Article
  • Cited by 33
  • 10.1007/s00330-020-07083-2
Improved long-term prognostic value of coronary CT angiography-derived plaque measures and clinical parameters on adverse cardiac outcome using machine learning
  • Jul 28, 2020
  • European Radiology
  • Christian Tesche + 13 more

To evaluate the long-term prognostic value of coronary CT angiography (cCTA)-derived plaque measures and clinical parameters on major adverse cardiac events (MACE) using machine learning (ML). Datasets of 361 patients (61.9 ± 10.3 years, 65% male) with suspected coronary artery disease (CAD) who underwent cCTA were retrospectively analyzed. MACE was recorded. cCTA-derived adverse plaque features and conventional CT risk scores, together with cardiovascular risk factors, were provided to an ML model to predict MACE. A boosted ensemble algorithm (RUSBoost) utilizing decision trees as weak learners, with repeated nested cross-validation to train and validate the model, was used. Performance of the ML model was calculated using the area under the curve (AUC). MACE was observed in 31 patients (8.6%) after a median follow-up of 5.4 years. Discriminatory power was significantly higher for the ML model (AUC 0.96 [95%CI 0.93-0.98]) compared with conventional CT risk scores, including Agatston calcium score (AUC 0.84 [95%CI 0.80-0.87]), segment involvement score (AUC 0.88 [95%CI 0.84-0.91]), and segment stenosis score (AUC 0.89 [95%CI 0.86-0.92], all p < 0.05). Similar results were shown for adverse plaque measures (AUCs 0.72-0.82, all p < 0.05) and clinical parameters including the Framingham risk score (AUCs 0.71-0.76, all p < 0.05). The ML model yielded significantly higher diagnostic performance compared with logistic regression analysis (AUC 0.96 vs. 0.92, p = 0.024). Integration of an ML model improves the long-term prediction of MACE compared with conventional CT risk scores, adverse plaque measures, and clinical information. ML algorithms may improve the integration of patient information to enhance risk stratification.
• A machine learning (ML) model portends high discriminatory power to predict major adverse cardiac events (MACE).
• ML-based risk stratification shows superior diagnostic performance for MACE prediction over coronary CT angiography (cCTA)-derived risk scores or clinical parameters alone.
• An ML model outperforms conventional logistic regression analysis for the prediction of MACE.
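AUC is the common yardstick across this and the following studies. As a reference point, it equals the probability that a randomly chosen positive case is scored above a randomly chosen negative one, which a short rank-based sketch can compute directly (illustrative code, not taken from any of these papers):

```python
# Minimal AUC (area under the ROC curve) via the Mann-Whitney statistic:
# the fraction of positive/negative pairs in which the positive case scores
# higher, counting ties as half. Real studies use library routines; this is
# a reference implementation for small score lists.

def auc(labels, scores):
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y = [0, 0, 1, 1]
perfect = auc(y, [0.1, 0.2, 0.8, 0.9])   # positives always rank higher
chance = auc(y, [0.5, 0.5, 0.5, 0.5])    # uninformative scores
```

A perfectly separating score list gives AUC 1.0; constant scores give 0.5, the chance level against which the models above are implicitly compared.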

  • Research Article
  • Cited by 13
  • 10.1007/s00261-021-03051-6
Predicting the stages of liver fibrosis with multiphase CT radiomics based on volumetric features.
  • Mar 22, 2021
  • Abdominal Radiology
  • Enming Cui + 6 more

To develop and externally validate a multiphase computed tomography (CT)-based machine learning (ML) model for staging liver fibrosis (LF) using whole-liver slices. The development dataset comprised 232 patients with pathological analysis for LF, and the test dataset comprised 100 patients from an independent outside institution. Feature extraction was performed on the precontrast phase (PCP), arterial phase (AP), portal venous phase (PVP), and three-phase CT images. CatBoost was used for ML model investigation, using the features with good reproducibility. The diagnostic performance of ML models based on each single-phase and on three-phase CT images was compared with that of radiologists' interpretations, the aminotransferase-to-platelet ratio index, and the fibrosis index based on four factors (FIB-4), using the receiver operating characteristic curve with the area under the curve (AUC) value. Although the ML model based on three-phase CT images (AUC = 0.65-0.80) achieved higher AUC values than those based on PCP (AUC = 0.56-0.69) and PVP (AUC = 0.51-0.74) in predicting various stages of LF, no significant difference was found. The best CT-based ML model (AUC = 0.65-0.80) outperformed FIB-4 in differentiating advanced LF and cirrhosis, and outperformed radiologists' interpretation (AUC = 0.50-0.76) in the diagnosis of significant and advanced LF. The PCP-, PVP-, and three-phase CT-based ML models are all acceptable for assessing LF, and the performance of the PCP-based ML model is comparable to that of the enhanced CT image-based ML models.

  • Research Article
  • 10.1182/blood-2024-211964
Systematic Review of Machine Learning Models for Myelodysplastic Syndrome Diagnosis
  • Nov 5, 2024
  • Blood
  • Karna Desai + 5 more


  • Research Article
  • 10.1186/s12874-025-02694-z
Comparison of machine learning methods versus traditional Cox regression for survival prediction in cancer using real-world data: a systematic literature review and meta-analysis
  • Oct 28, 2025
  • BMC Medical Research Methodology
  • Yinan Huang + 6 more

Background: Accurate prediction of survival in oncology can guide targeted interventions. The traditional regression-based Cox proportional hazards (CPH) model has statistical assumptions and may have limited predictive accuracy. With the capability to model large datasets, machine learning (ML) holds the potential to improve the prediction of time-to-event outcomes, such as cancer survival outcomes. The present study aimed to systematically summarize the use of ML models for cancer survival outcomes in observational studies and to compare the performance of ML models with CPH models.
Methods: We systematically searched PubMed, MEDLINE (via EBSCO), and Embase for studies that evaluated ML models vs. CPH models for cancer survival outcomes. The use of ML algorithms was summarized, and either the area under the curve (AUC) or the concordance index (C-index) for the ML and CPH models was presented descriptively. Only studies that provided a measure of discrimination, i.e., AUC or C-index, with a 95% confidence interval (CI) were included in the final meta-analysis. A random-effects model was used to compare predictive performance in the pooled AUC or C-index estimates between ML and CPH models using R. The quality of the studies was evaluated using available checklists. Multiple sensitivity analyses were performed.
Results: A total of 21 studies were included for systematic review and 7 for meta-analysis. Across the 21 articles, diverse ML models were used, including random survival forest (N=16, 76.19%), gradient boosting (N=5, 23.81%), and deep learning (N=8, 38.09%). In predicting cancer survival outcomes, ML models showed no superior performance over CPH regression. The standardized mean difference in AUC or C-index was 0.01 (95% CI: -0.01 to 0.03). Results from the sensitivity analyses confirmed the robustness of the main findings.
Conclusions: ML models had similar performance to CPH models in predicting cancer survival outcomes. Although this systematic review highlights the promising use of ML to improve the quality of care in oncology, its findings also suggest opportunities to improve ML reporting transparency. Future systematic reviews should focus on the comparative performance between specific ML models and CPH regression for time-to-event outcomes in specific types of cancer or other disease areas.
Supplementary information: The online version contains supplementary material available at 10.1186/s12874-025-02694-z.
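The pooling step this review describes uses a random-effects model. A minimal sketch of the standard DerSimonian-Laird estimator conveys the idea; the effect estimates and variances below are invented for illustration and are not data from the review.

```python
import math

# DerSimonian-Laird random-effects pooling of study-level effect estimates,
# the kind of model used to pool AUC/C-index differences across studies.
# Inputs: per-study effects and their sampling variances (illustrative only).

def pool_random_effects(effects, variances):
    w = [1 / v for v in variances]                       # fixed-effect weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)        # between-study variance
    w_re = [1 / (v + tau2) for v in variances]           # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

est, ci = pool_random_effects([0.02, 0.00, 0.01], [0.0001, 0.0001, 0.0001])
```

With these toy inputs the pooled difference is 0.01 and its 95% CI spans zero, mirroring the review's qualitative conclusion that ML and CPH performance did not differ significantly.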

  • Research Article
  • 10.29271/jcpsp.2025.08.1007
Predicting Extracorporeal Shock Wave Lithotripsy Outcomes Using Machine Learning and the Triple-/Quadruple-D Scores.
  • Aug 1, 2025
  • Journal of the College of Physicians and Surgeons--Pakistan : JCPSP
  • Mucahit Gelmis + 5 more

To evaluate the predictive performance of the triple-D and quadruple-D scores integrated with machine learning (ML) models in determining stone-free outcomes after extracorporeal shock wave lithotripsy (ESWL), and to compare ML model performance and identify the key predictors influencing ESWL success. An observational study. Place and Duration of the Study: Department of Urology, Gaziosmanpasa Training and Research Hospital, Istanbul, Turkiye, from October 2020 to November 2024. A total of 309 patients who underwent ESWL were analysed. The patients were categorised into stone-free and non-stone-free groups based on post-treatment imaging. Clinical parameters, including the quadruple-D score (stone volume, density, skin-to-stone distance [SSD], and location), were recorded. Three ML models, random forest (RF), logistic regression (LR), and neural network (NN), were trained on 80% of the dataset and tested on 20%. Model performance was assessed using accuracy, area under the curve (AUC), precision, recall, and F1 score. The quadruple-D score (AUC: 0.724) demonstrated superior predictive power compared to the triple-D score (AUC: 0.700). Among the ML models, RF achieved the highest accuracy (82.9%, AUC: 0.91), followed by NN (80.9%, AUC: 0.87) and LR (79.6%, AUC: 0.85). Significant predictors of ESWL success were stone density, volume, SSD, and the quadruple-D score, while age and body mass index (BMI) were not. Integrating the quadruple-D score with ML models, particularly RF, enhances the prediction of ESWL outcomes. Combining clinical expertise with computational intelligence can refine patient selection and optimise treatment strategies. However, prospective studies are needed to validate these findings. Keywords: Extracorporeal shock wave lithotripsy, Quadruple-D score, Machine learning, Random forest, Stone-free prediction.
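The metrics this study reports (accuracy, precision, recall, F1) all derive from the binary confusion matrix. A short sketch with made-up counts shows the relationships:

```python
# Accuracy, precision, recall, and F1 from a binary confusion matrix --
# the metrics reported for the ESWL models. The counts are invented for
# illustration, not taken from the study.

def metrics(tp, fp, fn, tn):
    precision = tp / (tp + fp)   # of predicted stone-free, how many were
    recall = tp / (tp + fn)      # of truly stone-free, how many were found
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

m = metrics(tp=40, fp=10, fn=5, tn=45)
```

F1 is the harmonic mean of precision and recall, which is why a model can post high accuracy yet a mediocre F1 when the classes are imbalanced.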

  • Research Article
  • Cited by 2
  • 10.1097/md.0000000000038513
Performance evaluation of ML models for preoperative prediction of HER2-low BC based on CE-CBBCT radiomic features: A prospective study
  • Jun 14, 2024
  • Medicine
  • Xianfei Chen + 3 more

To explore the value of machine learning (ML) models based on contrast-enhanced cone-beam breast computed tomography (CE-CBBCT) radiomics features for the preoperative prediction of human epidermal growth factor receptor 2 (HER2)-low expression breast cancer (BC). Fifty-six patients with HER2-negative invasive BC who underwent preoperative CE-CBBCT were prospectively analyzed. Patients were randomly divided into training and validation cohorts at a ratio of approximately 7:3. A total of 1046 quantitative radiomic features were extracted from CE-CBBCT images and normalized using z-scores. The Pearson correlation coefficient and recursive feature elimination were used to identify the optimal features. Six ML models were constructed based on the selected features: linear discriminant analysis (LDA), random forest (RF), support vector machine (SVM), logistic regression (LR), AdaBoost (AB), and decision tree (DT). Receiver operating characteristic curves and the area under the curve (AUC) were used to evaluate model performance. Seven features were selected as the optimal features for constructing the ML models. In the training cohort, the AUC values for SVM, LDA, RF, LR, AB, and DT were 0.984, 0.981, 1.000, 0.970, 1.000, and 1.000, respectively; in the validation cohort, they were 0.859, 0.880, 0.781, 0.880, 0.750, and 0.713. Among all ML models, the LDA and LR models demonstrated the best performance. The DeLong test showed no significant differences among the receiver operating characteristic curves of the ML models in the training cohort (P > .05). In the validation cohort, however, the differences between the AUC of LDA and those of RF, AB, and DT were statistically significant (P = .037, .003, .046), as were the differences between the AUC of LR and those of RF, AB, and DT (P = .023, .005, .030); no other pairwise comparisons reached statistical significance. ML models based on CE-CBBCT radiomics features achieved excellent performance in the preoperative prediction of HER2-low BC and could potentially serve as an effective tool to assist in precise and personalized targeted therapy.

  • Research Article
  • 10.1007/s10620-025-09646-z
Value of Endoscopic Ultrasonography for Distinguishing Malignant from Benign Non-pancreatic Periampullary Lesions: An Explainable Machine Learning Study.
  • Jan 9, 2026
  • Digestive diseases and sciences
  • Xue-Yong Zuo + 2 more

Early discrimination of non-pancreatic periampullary lesions (NPLs) is challenging owing to their complex anatomy and the absence of representative clinical symptoms. To establish an interpretable machine learning (ML) model that integrates clinical variables and endoscopic ultrasonography (EUS) features to diagnose NPLs. A total of 158 patients suspected of having NPLs who underwent EUS were enrolled and randomly allocated into a training cohort (TC, n = 110) and a validation cohort (VC, n = 48). Clinical and EUS risk features were identified by multivariate logistic regression analysis and subsequently input into five ML classifiers to develop predictive models. The performance of the ML models was assessed using the area under the curve (AUC), calibration curve, and decision curve analysis (DCA). The Shapley Additive Explanations (SHAP) approach was employed to interpret the results of the optimal ML model. Among the five ML models developed, the ExtraTrees model achieved the highest AUC values of 0.94 (95% confidence interval (CI): 0.89-0.99) and 0.94 (95% CI: 0.82-1.00) in the TC and VC, respectively. This performance was followed by the extreme gradient boosting model (AUC = 0.94/0.93), the light gradient boosting machine (AUC = 0.92/0.91), the support vector machine (AUC = 0.91/0.94), and the logistic regression model (AUC = 0.86/0.87). The calibration curve and DCA graphically suggested good agreement and superior clinical benefits for the ExtraTrees model. SHAP analysis identified abdominal discomfort, lesion diameter, irregular shape, surface ulceration, and nonsmooth margin as the most influential features in the model's decision-making process. Our developed ML model, particularly the ExtraTrees model, exhibited superior capability and higher clinical benefit in distinguishing malignant from benign NPLs. Furthermore, the SHAP analysis provided insightful interpretation of the ExtraTrees model for individualized and transparent prediction of NPLs.

  • Research Article
  • 10.1101/2024.10.17.24315710
Detecting Glaucoma Worsening Using Optical Coherence Tomography Derived Visual Field Estimates.
  • Oct 18, 2024
  • medRxiv : the preprint server for health sciences
  • Alex T Pham + 6 more

Multiple studies have attempted to generate visual field (VF) mean deviation (MD) estimates using cross-sectional optical coherence tomography (OCT) data. However, whether such models offer any value in detecting longitudinal VF progression is unclear. We address this by developing a machine learning (ML) model to convert OCT data to MD and assessing its ability to detect longitudinal worsening. Retrospective, longitudinal study. A model dataset of 70,575 paired OCT/VFs was used to train an ML model converting OCT to VF-MD. A separate progression dataset of 4,044 eyes with ≥ 5 paired OCT/VFs was used to assess the ability of OCT-derived MD to detect worsening. Progression dataset eyes had two additional unpaired VFs (≥ 7 total) to establish a "ground truth" rate of progression defined by MD slope. We trained an ML model using paired VF/OCT data to estimate MD measurements for each OCT scan (OCT-MD). We used this ML model to generate longitudinal OCT-MD estimates for progression dataset eyes. We calculated MD slopes after substituting/supplementing VF-MD with OCT-MD and measured the ability to detect progression. We labeled true progressors using a ground truth MD slope < -0.5 dB/year calculated from ≥ 7 VF-MD measurements. We compared the area under the curve (AUC) of MD slopes calculated using both VF-MD (with < 7 measurements) and OCT-MD. Because we found OCT-MD substitution had a statistically inferior AUC to VF-MD, we simulated the effect of reducing OCT-MD mean absolute error (MAE) on the ability to detect worsening. The main outcome measure was AUC. OCT-MD estimates had an MAE of 1.62 dB. The AUC of MD slopes with partial OCT-MD substitution was significantly worse than that of the VF-MD slope. Supplementing VF-MD with OCT-MD also did not improve AUC, regardless of MAE. OCT-MD estimates needed an MAE ≤ 1.00 dB before AUC was statistically similar to VF-MD alone.
ML models converting OCT data to VF-MD with error levels lower than published in prior work (MAE: 1.62 dB) were inferior to VF-MD data for detecting trend-based VF progression. Models converting OCT data to VF-MD must achieve better prediction errors (MAE ≤ 1 dB) to be clinically valuable at detecting VF worsening.
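The trend-based progression this study evaluates is defined by the slope of MD over time, i.e. an ordinary least-squares fit in dB per year. A plain sketch follows; the example series is fabricated for illustration.

```python
# Least-squares slope of mean deviation (MD) over time, in dB/year.
# A sufficiently negative slope (e.g. < -0.5 dB/year) marks a progressing
# eye. The MD series below is invented for illustration.

def slope(times, values):
    """Ordinary least-squares slope of `values` against `times`."""
    n = len(times)
    mt = sum(times) / n
    mv = sum(values) / n
    num = sum((t - mt) * (v - mv) for t, v in zip(times, values))
    den = sum((t - mt) ** 2 for t in times)
    return num / den

years = [0, 1, 2, 3, 4, 5, 6]                        # 7 visits, one per year
md_db = [-2.0, -2.6, -3.1, -3.5, -4.1, -4.4, -5.0]   # a worsening eye
rate = slope(years, md_db)                            # negative = worsening
```

Here the fitted rate is about -0.49 dB/year, illustrating why measurement error in OCT-derived MD estimates (MAE 1.62 dB per point) can easily swamp a progression signal of this magnitude.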

  • Research Article
  • 10.1038/s41598-025-85695-8
The application of machine learning approaches to classify and predict fertility rate in Ethiopia
  • Jan 20, 2025
  • Scientific Reports
  • Ewunate Assaye Kassaw + 3 more

Integrating machine learning (ML) models into healthcare systems is a rapidly evolving field with the potential to revolutionize care delivery. This study aimed to classify fertility rates and identify significant predictors using ML models among reproductive-age women in Ethiopia. The study applied eight ML models to 5864 reproductive-age women using Ethiopian Demographic and Health Survey (EDHS) 2019 data. The Python programming language was used to develop these models. Predictors of fertility rate were determined using feature importance techniques. The performance of the models was evaluated using accuracy, area under the curve (AUC), precision, recall, F1-score, specificity, and sensitivity. The mean age of participants was 32.7 (± 5.6) years. The random forest classifier (accuracy = 0.901 and AUC = 0.961), followed by a one-dimensional convolutional neural network (accuracy = 0.899 and AUC = 0.958), logistic regression (accuracy = 0.874 and AUC = 0.937), and gradient boost classifier (accuracy = 0.851 and AUC = 0.927), were the top-performing ML models. Family size, age, occupation, and education, with average importance scores of 0.198, 0.151, 0.118, and 0.081, respectively, were the most significant predictors of fertility rate. The best ML models to classify and predict fertility rates were random forest, one-dimensional convolutional neural network, logistic regression, and gradient boost classifier. The findings on important factors of fertility rate can inform targeted public health programs that address disparities related to family size, occupation, education, and other socioeconomic factors.

  • Research Article
  • Cited by 1
  • 10.1186/s12933-025-02911-5
An ensemble machine learning-based risk stratification tool for 30-day mortality prediction in critically ill cardiovascular patients.
  • Sep 30, 2025
  • Cardiovascular diabetology
  • Mingxing Lei + 11 more

Early mortality prediction in critically ill patients with cardiovascular disease remains challenging. This study aimed to develop and validate an ensemble machine learning (ML) model to predict 30-day mortality, comparing its performance with conventional severity scores and interrogating the incremental prognostic value of stress hyperglycemia ratio (SHR). A retrospective cohort of 1,595 ICU patients with cardiovascular disease combined with diabetes (2008-2022) was analyzed. SHR was calculated as admission glucose divided by estimated average glucose (eAG) from HbA1c. Six ML models (eXtreme Gradient Boosting [XGBoost], Decision Tree [DT], Random Forest [RF], Artificial Neural Network [ANN], Logistic Regression [LR], and Support Vector Machine [SVM]) were trained on 80% of the data, with the top three performers combined into an ensemble model. Model performance was evaluated using area under the curve (AUC), precision-recall, calibration, and clinical utility metrics. The 30-day mortality rate was 10.8% in the entire cohort (n = 173). The ensemble model demonstrated superior predictive performance with an AUC of 0.912 (95% CI: 0.888-0.936), outperforming both individual ML models (XGBoost, AUC = 0.903) and traditional scoring systems (APS III/SOFA/SAPS II AUCs ≤ 0.742; all P < 0.001). The top six important predictors included anti-hypertensives, aspirin, blood urea nitrogen (BUN), white blood cell (WBC), age, and red blood cell (RBC), with the Shapley Additive Explanations analysis revealing clinically meaningful patterns: a nonlinear risk escalation for age, linear risk increases with rising BUN and bilirubin levels, a protective effect associated with higher RBC counts, and both low and high WBC levels linked to increased early death risk. 
While SHR significantly improved the performance of traditional scoring systems (e.g., increasing SOFA AUC from 0.741 to 0.757, P = 0.010), its addition to the ensemble model provided limited incremental benefit (ΔAUC = -0.032, P = 0.094). External validation in an independent cohort (n = 307) confirmed the model's robustness (AUC = 0.891, 95% CI: 0.864-0.917), with decision curve analysis demonstrating superior clinical utility across a wide range of risk thresholds. The ensemble ML model outperformed conventional prognostic tools in predicting 30-day mortality, with SHR augmenting traditional tools but not the ensemble ML model. This approach offers a reliable, interpretable framework for risk stratification in high-risk cardiovascular patients.
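The study combines its top three models into an ensemble; one common way to do this is soft voting, which averages each model's predicted probability per patient. A minimal sketch follows, with invented probabilities (the study does not specify its exact combination rule, so this is an assumption for illustration).

```python
# Soft-voting ensemble: average the per-patient predicted probabilities of
# several base models. The probability lists below are invented; a real
# pipeline would take them from trained XGBoost/RF/ANN models.

def soft_vote(prob_lists):
    """Average probabilities position-wise across models."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]

xgb = [0.90, 0.10, 0.40]   # model 1's 30-day mortality probabilities
rf  = [0.80, 0.20, 0.50]   # model 2
ann = [0.70, 0.30, 0.60]   # model 3
ensemble = soft_vote([xgb, rf, ann])
```

Averaging tends to cancel the uncorrelated errors of the base models, which is the usual explanation for an ensemble edging out its best individual member (0.912 vs. 0.903 AUC here).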

  • Research Article
  • 10.1177/08850666251390848
An Interpretable Machine Learning Model for Early Multitemporal Prediction of Onset of Acute Kidney Injury in Intensive Care Unit Patients with Severe Trauma.
  • Oct 29, 2025
  • Journal of intensive care medicine
  • Bingrui Gao + 3 more

Acute Kidney Injury (AKI), a leading cause of organ failure in critically ill patients, demands early identification of high-risk patients to improve outcomes. Yet comparative analyses of diagnostic and prognostic machine learning (ML) models across multiple post-admission timeframes are lacking. Using MIMIC-IV, we performed feature selection with the Boruta algorithm, then developed and compared six ML models to predict AKI risk at 0-24, 24-48, 48-72, 0-48, and 0-72 h post-ICU admission. Model performance was evaluated using the Area Under the Curve (AUC) and confusion matrix. Decision curve and calibration analyses assessed clinical applicability. We compared the models with Sequential Organ Failure Assessment (SOFA) and SAPS II scores to evaluate the accuracy of the ML models. Finally, Shapley Additive Explanations (SHAP) values were used to interpret and visualize key features of the optimal model. Our study involved 2092 trauma Intensive Care Unit (ICU) patients. Using 17 of the 48 candidate features available 24 h after ICU admission, all six ML models outperformed SOFA and SAPS II, and extreme gradient boosting (XGBoost) exhibited the best performance, achieving an AUC of 0.948 (95% CI [0.929-0.966]) for AKI prediction within 24 h of admission, with AUCs of 0.941 [0.892-0.917] and 0.878 [0.863-0.892] for the 0-48 and 0-72 h periods, respectively. However, predictive accuracy was very limited at 24-48 h (AUC 0.602 [0.562-0.643]) and 48-72 h (AUC 0.490 [0.429-0.551]). Urine output per kilogram per hour at 6 and 12 h and age were the most important features identified through SHAP analysis. Our study found that ML models excel in diagnosing AKI risk in ICU trauma patients but have limited prognostic accuracy at 24-48 and 48-72 h post-admission. Further research using time-series ML models with optimal windows is needed to improve this.

  • Research Article
  • 10.1111/echo.70377
Machine Learning Models Integrating Two-Dimensional Speckle Tracking Echocardiography and Clinical Variables for Diagnosis of Severe Coronary Artery Disease.
  • Jan 1, 2026
  • Echocardiography (Mount Kisco, N.Y.)
  • Yuting Hu + 8 more

To develop and validate machine learning (ML) models integrating two-dimensional speckle tracking echocardiography (2D-STE) parameters with clinical variables for robust identification of severe coronary artery disease (sCAD). In this retrospective cohort study, five distinct ML models (Random Forest [RF], Support Vector Machine [SVM], K-Nearest Neighbors [KNN], Multi-Layer Perceptron [MLP], and Extremely Randomized Trees [Extra Trees]) were constructed to identify sCAD in a cohort of 204 patients (80% training set, 20% independent test set). Within the independent test set, two junior sonographers' diagnostic performance for sCAD was compared first without and then with ML assistance over a 2-week interval. SHapley Additive exPlanations (SHAP) analysis was applied to visualize and interpret the models, identifying key features driving sCAD prediction accuracy, with results visualized through dependence diagrams and force plots. Furthermore, a clinical nomogram integrating key predictors identified by the ML models was developed to enable individualized quantification of sCAD risk. Utilizing five features, the MLP demonstrated the best performance with an area under the curve (AUC) of 0.870 and a sensitivity of 0.944. The SHAP visualization analysis for this model indicated that "LV AP4 Endo Peak L. Time SD" significantly influenced its predictions. The MLP model (AUC = 0.870) outperformed both junior sonographers (AUC = 0.687) and a nomogram constructed from ML-selected features (AUC = 0.712). Additionally, the results revealed that junior sonographers achieved significantly improved performance when assisted by the ML models. The developed ML models could differentiate patients with angiography-confirmed sCAD from those without. Importantly, these models significantly improved the diagnostic performance of junior sonographers when used as an assistive tool.

  • Research Article
  • 10.1161/str.56.suppl_1.wmp26
Abstract WMP26: Clinical Characteristic-Driven Machine Learning Models for the Prediction of Stroke Subtypes and Eligibility of Endovascular Thrombectomy
  • Feb 1, 2025
  • Stroke
  • Tomohide Yoshie + 12 more

Introduction: Several pre-hospital scales predict ischemic stroke due to large vessel occlusion; however, these scales often fail to detect atypical endovascular thrombectomy (EVT) cases, including posterior circulation and distal vessel occlusion. Additionally, these pre-hospital scales cannot differentiate between stroke subtypes, such as ischemic stroke and intracerebral hemorrhage. We aimed to develop machine learning (ML) models to predict stroke subtypes and EVT eligibility. Methods: We conducted an analysis using data from the Japan Stroke Data Bank, a nationwide acute stroke registry. Patients with ischemic stroke, intracerebral hemorrhage, or subarachnoid hemorrhage who were hospitalized between 2016 and 2020 were included in this study. We developed two ML models: (1) a model to predict patients who received EVT, and (2) a model to predict stroke categories (ischemic stroke treated with EVT, ischemic stroke treated with tPA, ischemic stroke without EVT or tPA, intracerebral hemorrhage, and subarachnoid hemorrhage). Patient data were divided into a derivation cohort (data from 2016 to 2019) and a validation cohort (data from 2020). The input variables for the ML models consisted of 129 clinical characteristics, including past medical history and neurological examination findings. Imaging and laboratory data were not utilized in the ML process. We initially developed the ML models using the LightGBM algorithm with all 129 variables. Subsequently, we selected the top 10 variables for model development using feature importance and a brute-force method, comparing the result against commonly used pre-hospital scales for predicting patients with EVT. Results: Of the 62,588 patients included in the study, 4,353 were treated with EVT, 3,001 with tPA, 39,710 had ischemic stroke treated without EVT or tPA, 12,498 had intracerebral hemorrhage, and 3,028 had subarachnoid hemorrhage. The area under the curve (AUC) for predicting patients treated with EVT was 0.828 (95% CI 0.817-0.839).
For predicting the five stroke categories, the multiclass AUC was 0.848, and the micro-average F1 score was 0.705. Using the brute-force algorithm to select 10 variables, the ML model for predicting EVT treatment achieved an AUC of 0.794 (95% CI 0.784-0.805), outperforming commonly used pre-hospital scales (AUC 0.584-0.690). Conclusions: The ML models utilizing only clinical characteristics can accurately predict stroke subtypes and EVT eligibility, offering a potential improvement over current pre-hospital scales.
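The variable-reduction step described above, keeping the 10 most important of 129 variables, amounts to ranking features by an importance score and keeping the top k. A minimal sketch follows; the feature names and scores are invented for illustration, and in practice the scores would come from the trained LightGBM model.

```python
# Select the k features with the highest importance scores -- the kind of
# reduction used to shrink 129 clinical variables down to 10. The names and
# scores below are invented for illustration.

def top_k_features(importances, k):
    """Return the k feature names with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]

scores = {"age": 0.30, "nihss": 0.25, "afib": 0.20, "sbp": 0.15, "bmi": 0.10}
selected = top_k_features(scores, k=3)
```

The study's brute-force step goes further, retraining on candidate subsets and keeping the subset with the best validation AUC rather than trusting the importance ranking alone.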

  • Research Article
  • Cited by 2
  • 10.3389/fendo.2024.1353023
Meta-analysis of machine learning models for the diagnosis of central precocious puberty based on clinical, hormonal (laboratory) and imaging data
  • Mar 25, 2024
  • Frontiers in Endocrinology
  • Yilin Chen + 2 more

Background: Central precocious puberty (CPP) is a common endocrine disorder in children, and its diagnosis primarily relies on the gonadotropin-releasing hormone (GnRH) stimulation test, which is expensive and time-consuming. With the widespread application of artificial intelligence in medicine, some studies have used machine learning (ML) models based on clinical, hormonal (laboratory) and imaging data to identify CPP. However, the results of these studies varied widely and were difficult to compare directly, mainly because of the diverse ML methods used; the diagnostic value of such models for CPP therefore remains elusive. The aim of this study was to investigate the diagnostic value of clinical, hormonal (laboratory) and imaging data-based ML models for CPP through a meta-analysis of existing studies.

Methods: We conducted a comprehensive search for relevant English-language articles on clinical, hormonal (laboratory) and imaging data-based ML models for diagnosing CPP, covering the period from each database's creation date to December 2023. Pooled sensitivity, specificity, positive likelihood ratio (LR+), negative likelihood ratio (LR-), the summary receiver operating characteristic (SROC) curve, and the area under the curve (AUC) were calculated to assess diagnostic value. The I² test was used to evaluate heterogeneity, the source of heterogeneity was investigated through meta-regression analysis, and publication bias was assessed using the Deeks funnel plot asymmetry test.

Results: Six studies met the eligibility criteria. The pooled sensitivity and specificity were 0.82 (95% confidence interval (CI) 0.62-0.93) and 0.85 (95% CI 0.80-0.90), respectively. The LR+ was 6.00 and the LR- was 0.21, indicating that the models exhibited a strong ability to confirm or exclude CPP. Additionally, the SROC curve showed an AUC of 0.90 (95% CI 0.87-0.92), demonstrating good diagnostic value for CPP.

Conclusion: Based on the outcomes of our meta-analysis, clinical and imaging data-based ML models are excellent diagnostic tools with high sensitivity, specificity, and AUC in the diagnosis of CPP. Despite the geographical limitations of the study findings, future research will strive to address these issues to enhance applicability and reliability, providing more precise guidance for the differentiation and treatment of CPP.
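As a quick sanity check on the reported diagnostic metrics, the likelihood ratios follow from sensitivity and specificity by their standard definitions: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. A minimal sketch; note that pooled likelihood ratios in a meta-analysis are typically estimated directly (or via a bivariate model), so the reported LR+ of 6.00 need not exactly match the value computed from the pooled sensitivity and specificity:

```python
# Standard likelihood-ratio definitions for a diagnostic test.
def lr_positive(sensitivity, specificity):
    """Probability ratio of a positive result in diseased vs. non-diseased."""
    return sensitivity / (1.0 - specificity)

def lr_negative(sensitivity, specificity):
    """Probability ratio of a negative result in diseased vs. non-diseased."""
    return (1.0 - sensitivity) / specificity

# Plugging in the pooled estimates (sensitivity 0.82, specificity 0.85):
print(round(lr_positive(0.82, 0.85), 2))  # 5.47 (pooled LR+ reported as 6.00)
print(round(lr_negative(0.82, 0.85), 2))  # 0.21 (matches the reported LR-)
```

The small gap on the positive side is expected when likelihood ratios are pooled across studies rather than derived from the pooled sensitivity and specificity.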

  • Research Article
  • Cited by 1
  • 10.1177/10547738241260947
Machine Learning Predicts Peripherally Inserted Central Catheters-Related Deep Vein Thrombosis Using Patient Features and Catheterization Technology Features.
  • Jul 1, 2024
  • Clinical nursing research
  • Yuan Sheng + 1 more

This study used patient-feature and catheterization-technology-feature variables to train machine learning (ML) models that predict peripherally inserted central catheter-related deep vein thrombosis (PICCs-DVT), and analyzed the importance of the two feature types from the perspective of "input-output" correlation. To summarize the variables describing the two feature types comprehensively and systematically, the study reviewed 18 publications that used such features to predict PICCs-DVT. A total of 21 variables were identified, and feature values were extracted from the records of 1,065 PICC patients treated between January 1, 2021 and August 31, 2022 to construct a data sample set. Seventy percent of the sample set was used for model training and hyperparameter optimization, and 30% was used for PICCs-DVT prediction and feature-importance analysis with three common ML classification models: a support vector classifier (SVC), a random forest (RF), and an artificial neural network (ANN). Prediction performance was evaluated with four metrics: precision (P), recall (R), accuracy (ACC), and area under the curve (AUC). Feature importance was analyzed with a single-feature method based on the "input-output" sensitivity principle: Permutation Importance. On the test set, the mean performance across the three models was P = 0.92, R = 0.95, ACC = 0.88, and AUC = 0.81. Specifically, the RF model achieved P = 0.95, R = 0.96, ACC = 0.92, AUC = 0.86; the ANN model P = 0.92, R = 0.95, ACC = 0.88, AUC = 0.81; and the SVC model P = 0.88, R = 0.94, ACC = 0.85, AUC = 0.77.
In the feature-importance analysis, catheter-to-vein rate (RF: 91.55%, ANN: 82.25%, SVC: 87.71%), Zubrod-ECOG-WHO score (RF: 66.35%, ANN: 82.25%, SVC: 44.35%), and insertion attempt (RF: 44.35%, ANN: 37.65%, SVC: 65.80%) ranked among the top three features in all three models' PICCs-DVT prediction tasks, showing relatively consistent rankings. The ML models thus perform well in predicting PICCs-DVT and reveal a relatively consistent feature-importance ranking from the data. The important features identified may help clinical staff better understand and analyze the formation mechanism of PICCs-DVT from a data-driven perspective.
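The workflow the abstract describes (a 70/30 split, classifier training, the four reported metrics, and permutation-based feature importance) can be sketched with scikit-learn. This is an illustrative assumption-laden sketch, not the authors' code: the synthetic data stands in for the unavailable patient dataset, only the RF model is shown, and all hyperparameters are defaults.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 1,065 samples and 21 features mirror the study's
# patient count and its 21 patient/catheterization variables.
X, y = make_classification(n_samples=1065, n_features=21, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=0)

rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
y_pred = rf.predict(X_te)
y_prob = rf.predict_proba(X_te)[:, 1]

# The four metrics reported in the abstract: P, R, ACC, AUC.
metrics = {
    "P": precision_score(y_te, y_pred),
    "R": recall_score(y_te, y_pred),
    "ACC": accuracy_score(y_te, y_pred),
    "AUC": roc_auc_score(y_te, y_prob),
}

# Permutation Importance: shuffle one feature at a time on the held-out set
# and measure the score drop -- the "input-output" sensitivity principle.
pi = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
ranking = np.argsort(pi.importances_mean)[::-1]  # most important feature first
```

Ranking features by `pi.importances_mean`, as the last line does, is what yields the "top three features per model" comparison the abstract reports.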
