Assessment of myocardial fibrosis using fusion models of echocardiographic radiomics and deep learning: Animal feasibility study


Similar Papers
  • Research Article
  • 10.1158/1538-7445.am2021-184
Abstract 184: The utility of deep metric learning for breast cancer identification on mammographic images
  • Jul 1, 2021
  • Cancer Research
  • Justin Du + 8 more

Purpose: Although deep learning (DL) models have shown increasing ability to accurately classify diagnostic images in oncology, large amounts of well-curated data are often needed to match human-level performance. Given the relative paucity of imaging datasets for less prevalent cancer types, there is an increasing need for methods which can improve the performance of deep learning models trained using limited diagnostic images. Deep metric learning (DML) is a potential method which can improve accuracy in deep learning models trained on limited datasets. Leveraging a triplet-loss function, DML exponentially increases training data compared to a traditional DL model. In this study, we investigated the utility of DML to improve the accuracy of DL models trained to classify cancerous lesions found on screening mammograms. Methods: Using a dataset of 2620 lesions found on routine screening mammograms, we trained both a traditional DL model and a DML model to classify suspicious lesions as cancerous or benign. The VGG16 architecture was used as the basis for the DL and DML models. Model performance was compared by calculating model accuracy, sensitivity, and specificity on a blinded test set of 378 lesions. In addition to individual model performance, we also measured agreement accuracy when both the DL and DML models were combined. Sub-analyses were conducted to identify phenotypes which were best suited for each model type. Both models underwent hyperparameter optimization to identify ideal batch size, learning rate, and regularization to prevent overfitting. Results: We found that the combination of the traditional DL model with the DML model resulted in the highest overall accuracy (78.7%), representing a 7.1% improvement compared to the traditional DL model (p<.001). Alone, the traditional DL model had an improved accuracy compared to the DML model (71.4% vs 66.4%).
The traditional DL model had a higher sensitivity (94.8% vs 73.6%) but lower specificity (34.7% vs 55.1%) compared with the DML model. Sub-analyses suggested the traditional DL model was more accurate on higher density breasts, whereas the DML model was more accurate on lower density breasts. Additionally, the traditional DL model had the highest accuracy on oval shaped lesions, compared to the DML model, which was most accurate on irregularly shaped breast lesions. Conclusion: Our study suggests that combining DML models with traditional DL models can improve diagnostic image classification performance in cancer. Our results suggest DML models may provide increased specificity and help with classification of unique populations often misclassified by traditional DL models. Further studies investigating the utility of DML on other cancer imaging tasks are necessary to successfully build more robust DL models in cancer imaging. Citation Format: Justin Du, Sachin Umrao, Enoch Chang, Marina Joel, Aidan Gilson, Guneet Janda, Rachel Choi, Yongfeng Hui, Sanjay Aneja. The utility of deep metric learning for breast cancer identification on mammographic images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 184.
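
For readers unfamiliar with the triplet-loss function mentioned in this abstract, the following is a minimal sketch of the standard formulation, max(0, d(anchor, positive) - d(anchor, negative) + margin), on plain Python lists. It is not the authors' implementation: the function names, the Euclidean distance, and the margin of 1.0 are assumptions chosen for illustration, and a real model would compute this on learned embeddings (for example from a VGG16 backbone).

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is. The margin value here is an
    illustrative assumption, not taken from the abstract.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings: the positive is close, the negative is far.
well_separated = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 3.0])
# Swapping positive and negative violates the margin and is penalized.
violating = triplet_loss([0.0, 0.0], [0.0, 3.0], [0.0, 1.0])
```

Because every combination of an anchor, a same-class sample, and a different-class sample forms a training example, the number of triplets grows much faster than the number of images, which is the data-multiplication effect the abstract refers to.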

  • Research Article
  • Cite count: 8
  • 10.1016/j.ijcard.2018.11.107
Diffuse myocardial fibrosis in adolescents operated with arterial switch for transposition of the great arteries - A CMR study
  • Nov 22, 2018
  • International Journal of Cardiology
  • K.R Suther + 9 more


  • Research Article
  • 10.1186/s12885-025-14971-7
Deep multi-instance learning model based on gadoxetic acid-enhanced MRI for predicting microvascular invasion of hepatocellular carcinoma: a multicenter, retrospective study
  • Oct 22, 2025
  • BMC Cancer
  • Yi Luo + 7 more

Objective: Microvascular invasion (MVI) is of great significance for the individualized treatment of hepatocellular carcinoma (HCC), and preoperative noninvasive prediction of MVI is still an urgent clinical problem. This study aimed to explore the effects of different regions of interest (ROI) and image input dimensions on the performance of deep learning (DL) models, and to select the best result to develop and validate a DL model for preoperative prediction of MVI. Materials and methods: A total of 206 patients with pathologically confirmed HCC from three hospitals were retrospectively enrolled and divided into training, internal validation and external test sets. Based on hepatobiliary phase (HBP) images of gadoxetic acid-enhanced MRI, 2D DL, 3D DL and 2.5D deep multi-instance learning (MIL) models were established. The receiver operating characteristic (ROC) curve was used to evaluate the predictive efficacy of the above models. Based on the optimal performance model, the T1WI-FS and T2WI-FS images were preprocessed correspondingly, and a multimodal prediction model including three sequences was constructed. The ROC and decision curves were used to visualize the predictive ability of the model. Results: Compared with 2D DL and 3D DL models, the 2.5D DL model based on all axial images of the ROI had the highest performance, with AUC values of 0.802 (95% CI, 0.669–0.936) and 0.759 (95% CI, 0.643–0.875) in the validation and test sets. The AUCs of the multimodal MRI model were 0.954 (95% CI, 0.920–0.989) in the training set, 0.857 (95% CI, 0.736–0.978) in the validation set, and 0.788 (95% CI, 0.681–0.895) in the test set. Conclusion: The DL model that selects all axial slices of the intratumoral and peritumoral regions as input shows robust capability in predicting MVI, which is expected to help clinical decision-making of individualized treatment for HCC.

  • Research Article
  • Cite count: 102
  • 10.1016/j.crad.2016.02.013
Assessment of myocardial fibrosis with T1 mapping MRI
  • Mar 19, 2016
  • Clinical Radiology
  • R.J Everett + 5 more


  • Research Article
  • Cite count: 17
  • 10.1016/j.crad.2013.11.012
The emerging role of cardiovascular MRI for risk stratification in hypertrophic cardiomyopathy
  • Jan 10, 2014
  • Clinical Radiology
  • E.T.D Hoey + 5 more


  • Research Article
  • Cite count: 24
  • 10.1136/heartjnl-2019-315211
Role of advanced left ventricular imaging in adults with aortic stenosis
  • Jun 11, 2020
  • Heart
  • Andreea Calin + 5 more

This review focuses on the available data regarding the utility of advanced left ventricular (LV) imaging in aortic stenosis (AS) and its potential impact for optimising the timing of aortic...

  • Research Article
  • Cite count: 39
  • 10.1007/s00330-021-08195-z
A deep learning-machine learning fusion approach for the classification of benign, malignant, and intermediate bone tumors.
  • Aug 25, 2021
  • European Radiology
  • Renyi Liu + 10 more

To build and validate deep learning and machine learning fusion models to classify benign, malignant, and intermediate bone tumors based on patient clinical characteristics and conventional radiographs of the lesion. In this retrospective study, data were collected from patients with pathologically confirmed diagnoses of bone tumors between 2012 and 2019. Deep learning and machine learning fusion models were built to classify tumors as benign, malignant, or intermediate using conventional radiographs of the lesion and potentially relevant clinical data. Five radiologists compared diagnostic performance with and without the model. Diagnostic performance was evaluated using the area under the curve (AUC). A total of 982 radiographs from 643 patients (median age, 21 years; interquartile range, 12-38 years; 244 women) were included. In the test set, for the binary classification tasks, the radiological model for benign/not benign, malignant/nonmalignant, and intermediate/not intermediate had AUCs of 0.846, 0.827, and 0.820, respectively; the fusion models had AUCs of 0.898, 0.894, and 0.865, respectively. In the three-category classification task, the radiological model achieved a macro average AUC of 0.813, and the fusion model had a macro average AUC of 0.872. In the observation test, the mean macro average AUC of all radiologists was 0.819. With the three-category classification fusion model support, the macro AUC improved by 0.026. We built, validated, and tested deep learning and machine learning models that classified bone tumors at a level comparable with that of senior radiologists. Model assistance may somewhat help radiologists' differential diagnoses of bone tumors.
• The deep learning model can be used to classify benign, malignant, and intermediate bone tumors.
• The machine learning model fusing information from radiographs and clinical characteristics can improve the classification capacity for bone tumors.
• The diagnostic performance of the fusion model is comparable with that of senior radiologists and is potentially useful as a complement to radiologists in a bone tumor differential diagnosis.

  • Research Article
  • 10.1093/humrep/deab130.259
P–260 Towards better explainable deep learning models for embryo selection in ART
  • Aug 6, 2021
  • Human Reproduction
  • Ashu Sharma + 4 more

Study question Can heatmaps generated by occlusion explain the patterns learned by deep learning (DL) models classifying the embryo viability in ART? Summary answer Occlusion experiments generate heatmaps that reveal which regions in frames of time-lapse video (TLV) are more discriminative for classification and prediction by the DL models. What is known already DL has widely been explored in ART for embryo selection. Depending upon input (video or image), different DL models classifying embryo viability are developed. However, whether the prediction is based on actual input features or random guessing is unknown. The embryo selection in ART is subjective. If the intention is using DL models’ prediction to transfer, freeze or discard the embryo, explanations of how they interpret embryonic development features brings transparency and trust. In other areas, heatmaps are used for explaining DL predictions. The heatmaps can be a tool to understand patterns learned by DL models for embryo selection. Study design, size, duration We trained two separate DL models for predicting the presence of fetal heartbeat for the transferred embryos. We further used occlusion generated heatmaps to explain the predictions. For training, retrospective data was used. The input dataset consisted of 136 TLVs and corresponding patient data for 132 participants (128: single embryo transfers and 8: double embryo transfer) from both IVF and ICSI treatment. Each video was assessed by an embryologist. Participants/materials, setting, methods DL models (A as ResNet–18, B as VGG16) are trained for predicting the presence of fetal heartbeat on a single frame extracted from TLV after day three or later. Model A has a better recall (0.7) compared to B (0.5). Heatmaps explain the reason behind models’ recall rate by visually representing patterns learned by them. Using occlusion filter size 30*30 with stride 14 and size 50*50 with stride 25, we generate heatmaps for both models. 
Main results and the role of chance The heatmaps generated using occlusion can represent visually the patterns discovered by the DL models when predicting the presence of a fetal heartbeat. Using occlusion filter size 30*30 with stride 14, we verified that Model B has lower recall because the heatmaps show that the model finds redundant features present outside the embryo region in many input frames. It could be interpreted that either the model has not learned relevant patterns or is more robust to noise. This representation of DL models equips us for better decision-making: whether to consider or discard the prediction, or rather to train the model further, preprocess the training data or change the network architecture. The heatmaps revealed that for frames where significant patterns learned by the models are within the embryo region, more weight was given to specific features like the inner cell mass, trophectoderm and some parts within the zona pellucida. Moreover, the heatmaps generated using occlusion are independent of the underlying model’s architecture, as the same experiment settings were used for both models. For occlusion filter size 50*50 with stride 25, the expanse of input regions (in or outside the embryo) considered relevant could be visualized for both models A and B. Limitations, reasons for caution Heatmaps generated by occluding input regions give a visual representation of features in individual frames, not directly on videos. Other techniques for explaining DL models with heatmaps besides occlusion (e.g., Grad-CAM) exist but were not evaluated. Furthermore, there is no quantitative measure for evaluating whether heatmaps are a good explanation or not. Wider implications of the findings: The heatmaps make the patterns discovered by DL models visually recognizable and bring forth the prominent portions of embryo regions. This will again improve understanding and trust in DL models’ predictions. 
Visual representation of DL models using heatmaps enables interpreting a prediction, performing model analysis and determining scope for improvement. Trial registration number Not applicable
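
The occlusion procedure described in this abstract can be sketched roughly as follows: slide a square patch over the frame, blank it out, and record how much the model's score drops. This is a minimal pure-Python sketch under stated assumptions, not the authors' code. The `score_fn` callable, the fill value of 0.0, and the toy 4x4 frame are hypothetical; only the default patch size 30 and stride 14 come from the abstract.

```python
def occlusion_heatmap(image, score_fn, patch=30, stride=14, fill=0.0):
    """Return a grid of score drops, one per occluded patch position.

    image    : 2-D list of pixel values (rows of equal length)
    score_fn : hypothetical callable mapping an image to a scalar score,
               standing in for a trained classifier's output
    patch    : side length of the square occlusion filter
    stride   : step between consecutive patch positions
    fill     : value written into the occluded region (an assumption)
    """
    height, width = len(image), len(image[0])
    baseline = score_fn(image)
    heatmap = []
    for top in range(0, height - patch + 1, stride):
        row = []
        for left in range(0, width - patch + 1, stride):
            # Copy the frame so each occlusion starts from the original.
            occluded = [list(r) for r in image]
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    occluded[y][x] = fill
            # A large drop means the model relied on this region.
            row.append(baseline - score_fn(occluded))
        heatmap.append(row)
    return heatmap


# Toy example: a "model" that only looks at the top-left 2x2 corner.
toy_frame = [[1.0] * 4 for _ in range(4)]


def toy_score(img):
    return sum(img[y][x] for y in range(2) for x in range(2))


toy_heatmap = occlusion_heatmap(toy_frame, toy_score, patch=2, stride=2)
```

In the toy example only the top-left cell of the heatmap is non-zero, which is the kind of localization the abstract uses to check whether a model attends to the embryo region or to areas outside it.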

  • Research Article
  • 10.1093/humrep/deaf097.481
P-172 Retrospective comparison of deep learning versus logistic regression for selecting the best embryo for transfer
  • Jun 1, 2025
  • Human Reproduction
  • T Vermilyea + 1 more

Study question How does the performance of a deep learning model compare to a logistic regression model for embryo ranking? Summary answer Top-ranked embryos via both deep learning and logistic regression showed higher pregnancy rates, but deep learning showed a greater improvement in pregnancy success. What is known already Artificial intelligence (AI) algorithms are now being utilized for embryo selection in IVF. However, there remains a debate about whether interpretable machine learning models are more suitable versus deep learning models, which are often considered “black-box” due to their complexity. Two previously developed models were compared in this study: 1) A deep learning model which utilizes CNNs to automatically analyze a static image of an embryo, and 2) A logistic regression model that incorporates embryo morphology grade (ie 5AB), embryo day (5, 6, or 7), and patient age. Both models were developed using 10,000+ embryo images from 11 U.S. clinics. Study design, size, duration A total of 4543 images and morphology grades of individual embryos were collected prospectively from 870 patients at two U.S. clinics using an embryo image capture software, from January - December 2024. Of these, 406 embryos were transferred. 90% of the transferred embryos were genetically tested. None of these embryos were used for training or testing either AI models. Participants/materials, setting, methods After removing aneuploid embryos, embryos were ranked within each patient’s cohort using both the deep learning and logistic regression models. We then compared pregnancy rates of embryos that were top-ranked in their cohort versus those that were lower-ranked. To reduce bias, we included only patients with multiple viable embryos to choose from and only considered first transfers. Differences in biochemical pregnancy and fetal heartbeat were compared for both approaches. 
Main results and the role of chance Retrospectively, the top-ranked deep learning embryo was transferred 43% of the time, whereas the top-ranked logistic regression embryo was transferred 76% of the time. Transferring the top-ranked embryo by deep learning was associated with an 8.9% higher pregnancy rate (76.1% vs. 67.2%, p = 0.08) and a 6.2% higher fetal heartbeat (60.0% vs 53.8%, p = 0.38). For logistic regression, the top-ranked embryo selection was associated with a 4.1% higher pregnancy rate (71.3% vs. 67.2%, p = 0.51) and a 4.1% higher fetal heartbeat (56.8% vs 52.7%, p = 0.45). P-values were >0.05 for all comparisons, indicating statistical non-significance. For all comparisons, there were no statistical or clinical differences in the average age of the patients between the two groups, nor were there differences in the average AI score of the top-ranked embryo in the cohort, suggesting that these comparisons did not introduce significant biases. Limitations, reasons for caution As this was a retrospective study, clinical decision making about which embryo to transfer was not influenced by either model rankings. The dataset was limited to two clinics, so further prospective validation is needed. Wider implications of the findings Both deep learning and logistic regression models show promise for selecting the top ranked embryo in a patient’s cohort. The simplicity and interpretability of the logistic regression may allow for faster adoption and clinical trust, while deep learning may further enhance success rates. Trial registration number No

  • Research Article
  • Cite count: 1
  • 10.3390/electronics13101996
Enhanced Sequence-to-Sequence Deep Transfer Learning for Day-Ahead Electricity Load Forecasting
  • May 20, 2024
  • Electronics
  • Vasileios Laitsos + 4 more

Electricity load forecasting is a crucial undertaking within all the deregulated markets globally. Among the research challenges on a global scale, the investigation of deep transfer learning (DTL) in the field of electricity load forecasting represents a fundamental effort that can inform artificial intelligence applications in general. In this paper, a comprehensive study is reported regarding day-ahead electricity load forecasting. For this purpose, three sequence-to-sequence (Seq2seq) deep learning (DL) models are used, namely the multilayer perceptron (MLP), the convolutional neural network (CNN) and the ensemble learning model (ELM), which consists of the weighted combination of the outputs of MLP and CNN models. Also, the study focuses on the development of different forecasting strategies based on DTL, emphasizing the way the datasets are trained and fine-tuned for higher forecasting accuracy. In order to implement the forecasting strategies using deep learning models, load datasets from three Greek islands, Rhodes, Lesvos, and Chios, are used. The main purpose is to apply DTL for day-ahead predictions (1–24 h) for each month of the year for the Chios dataset after training and fine-tuning the models using the datasets of the three islands in various combinations. Four DTL strategies are illustrated. In the first strategy (DTL Case 1), each of the three DL models is trained using only the Lesvos dataset, while fine-tuning is performed on the dataset of Chios island, in order to create day-ahead predictions for the Chios load. In the second strategy (DTL Case 2), data from both Lesvos and Rhodes concurrently are used for the DL model training period, and fine-tuning is performed on the data from Chios. The third DTL strategy (DTL Case 3) involves the training of the DL models using the Lesvos dataset, and the testing period is performed directly on the Chios dataset without fine-tuning. 
The fourth strategy is a multi-task deep learning (MTDL) approach, which has been extensively studied in recent years. In MTDL, the three DL models are trained simultaneously on all three datasets and the final predictions are made on the unknown part of the dataset of Chios. The results obtained demonstrate that DTL can be applied with high efficiency for day-ahead load forecasting. Specifically, DTL Case 1 and 2 outperformed MTDL in terms of load prediction accuracy. Regarding the DL models, all three exhibit very high prediction accuracy, especially in the two cases with fine-tuning. The ELM excels compared to the single models. More specifically, for conducting day-ahead predictions, it is concluded that the MLP model presents the best monthly forecasts with MAPE values of 6.24% and 6.01% for the first two cases, the CNN model presents the best monthly forecasts with MAPE values of 5.57% and 5.60%, respectively, and the ELM model achieves the best monthly forecasts with MAPE values of 5.29% and 5.31%, respectively, indicating the very high accuracy it can achieve.
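
The two quantities this abstract relies on, the mean absolute percentage error (MAPE) and a weighted combination of two models' outputs, can be illustrated with a short sketch. The load values and the equal weight of 0.5 below are made up for illustration; the abstract does not state how the ensemble weights were chosen.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (true for positive load data).
    """
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def weighted_ensemble(pred_a, pred_b, weight_a=0.5):
    """Weighted combination of two forecasts; weight_a is a hypothetical choice."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Made-up hourly loads (MW) and forecasts from two hypothetical models.
actual_load = [100.0, 200.0]
mlp_forecast = [110.0, 180.0]
cnn_forecast = [95.0, 210.0]

combined = weighted_ensemble(mlp_forecast, cnn_forecast, weight_a=0.5)
mlp_error = mape(actual_load, mlp_forecast)
combined_error = mape(actual_load, combined)
```

In this toy case the errors of the two forecasts partly cancel, so the combined forecast has a lower MAPE than the first model alone, which is the general motivation for an ensemble of this kind.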

  • Research Article
  • Cite count: 28
  • 10.1111/1365-2478.13097
Learning from unlabelled real seismic data: Fault detection based on transfer learning
  • Jun 6, 2021
  • Geophysical Prospecting
  • Ruoshui Zhou + 3 more

Significant advances have been made towards fault detection using deep learning. However, the fault labelling of seismic data requires great human effort. The resulting small sample problem makes it difficult for traditional deep learning methods to achieve the desired results. Existing research proposes to train a deep learning model with labelled synthetic seismic data to get good fault detection results. However, due to the complexity of the actual geological situation, there are inevitable differences between synthetic seismic data and real seismic data in many aspects, such as seismic signal frequency, frequency of fault distribution and degree of noise disturbance, so a deep learning model trained on synthetic seismic data struggles to produce good fault detection results in field data applications. We propose to use transfer learning to reduce the impact of these data differences: part of the deep transfer learning model is used to learn fault‐related features, and the other part is used to mine common features between the real seismic data and the synthetic seismic data, which makes the deep transfer learning model more suitable for real seismic data. Compared with the latest research progress, our method can greatly improve the effect of fault detection without real data labels, which can significantly reduce the cost of manual labelling.

  • Research Article
  • Cite count: 12
  • 10.1038/s41598-024-66481-4
Explainable artificial intelligence (XAI) for predicting the need for intubation in methanol-poisoned patients: a study comparing deep and machine learning models
  • Jul 8, 2024
  • Scientific Reports
  • Khadijeh Moulaei + 14 more

The need for intubation in methanol-poisoned patients, if not predicted in time, can lead to irreparable complications and even death. Artificial intelligence (AI) techniques like machine learning (ML) and deep learning (DL) greatly aid in accurately predicting intubation needs for methanol-poisoned patients. So, our study aims to assess Explainable Artificial Intelligence (XAI) for predicting intubation necessity in methanol-poisoned patients, comparing deep learning and machine learning models. This study analyzed a dataset of 897 patient records from Loghman Hakim Hospital in Tehran, Iran, encompassing cases of methanol poisoning, including those requiring intubation (202 cases) and those not requiring it (695 cases). Eight established ML (SVM, XGB, DT, RF) and DL (DNN, FNN, LSTM, CNN) models were used. Techniques such as tenfold cross-validation and hyperparameter tuning were applied to prevent overfitting. The study also focused on interpretability through SHAP and LIME methods. Model performance was evaluated based on accuracy, specificity, sensitivity, F1-score, and ROC curve metrics. Among DL models, LSTM showed superior performance in accuracy (94.0%), sensitivity (99.0%), specificity (94.0%), and F1-score (97.0%). CNN led in ROC with 78.0%. For ML models, RF excelled in accuracy (97.0%) and specificity (100%), followed by XGB with sensitivity (99.37%), F1-score (98.27%), and ROC (96.08%). Overall, RF and XGB outperformed other models, with accuracy (97.0%) and specificity (100%) for RF, and sensitivity (99.37%), F1-score (98.27%), and ROC (96.08%) for XGB. ML models surpassed DL models across all metrics, with accuracies from 93.0% to 97.0% for DL and 93.0% to 99.0% for ML. Sensitivities ranged from 98.0% to 99.37% for DL and 93.0% to 99.0% for ML. DL models achieved specificities from 78.0% to 94.0%, while ML models ranged from 93.0% to 100%. F1-scores for DL were between 93.0% and 97.0%, and for ML between 96.0% and 98.27%. 
DL models scored ROC between 68.0% and 78.0%, while ML models ranged from 84.0% to 96.08%. Key features for predicting intubation necessity include GCS at admission, ICU admission, age, longer folic acid therapy duration, elevated BUN and AST levels, VBG_HCO3 at initial record, and hemodialysis presence. This study showcases XAI's effectiveness in predicting intubation necessity in methanol-poisoned patients. ML models, particularly RF and XGB, outperform DL counterparts, underscoring their potential for clinical decision-making.

  • Research Article
  • Cite count: 23
  • 10.1007/s11356-021-13503-7
Spatial modelling of soil salinity: deep or shallow learning models?
  • Mar 23, 2021
  • Environmental Science and Pollution Research
  • Aliakbar Mohammadifar + 3 more

Understanding the spatial distribution of soil salinity is required to conserve land against degradation and desertification. Against this background, this study is the first attempt to predict soil salinity in the Jaghin basin, in southern Iran, by applying and comparing the performance of four deep learning (DL) models (deep convolutional neural networks-DCNNs, dense connected deep neural networks-DenseDNNs, recurrent neural networks-long short-term memory-RNN-LSTM and recurrent neural networks-gated recurrent unit-RNN-GRU) and six shallow machine learning (ML) models (bagged classification and regression tree-BCART, cforest, cubist, quantile regression with LASSO penalty-QR-LASSO, ridge regression-RR and support vector machine-SVM). To do this, 49 environmental Landsat 8-derived variables including digital elevation model (DEM)-extracted covariates, soil-salinity indices, and other variables (e.g., soil order, lithology, land use) were mapped spatially. For assessing the relationships between soil salinity and the factors controlling it, we collected 319 surficial (0-5 cm depth) soil samples and measured soil salinity on the basis of electrical conductivity (EC). We then selected the most important features (covariates) controlling soil salinity by applying a MARS model. The performance of the DL and shallow ML models for generating soil salinity spatial maps (SSSMs) was assessed using a Taylor diagram and the Nash-Sutcliffe coefficient (NSE). Among all 10 predictive models, DL models with NSE ≥ 0.9 (DCNNs was the most accurate model with NSE = 0.96) were selected as the four best models, and performed better than the six shallow ML models with NSE ≤ 0.83 (QR-LASSO was the weakest predictive model with NSE = 0.50). Based on the DCNNs, the values of the EC ranged between 0.67 and 14.73 dS/m, whereas for QR-LASSO the corresponding EC values were 0.37 to 19.6 dS/m. 
Overall, DL models performed better than shallow ML models for production of the SSSMs and therefore we recommend applying DL models for prediction purposes in environmental sciences.
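
The Nash-Sutcliffe coefficient used to rank the models above has a simple closed form: one minus the ratio of the squared prediction errors to the squared deviations of the observations from their mean. The sketch below is a minimal illustration with made-up EC values, not data from the study.

```python
def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance-sum of observations.

    1.0 means a perfect fit; 0.0 means the model is no better than
    always predicting the observed mean; negative values are worse.
    """
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / spread


# Made-up EC measurements (dS/m) and two hypothetical predictions.
observed_ec = [1.0, 2.0, 3.0]
perfect = nash_sutcliffe(observed_ec, [1.0, 2.0, 3.0])
mean_only = nash_sutcliffe(observed_ec, [2.0, 2.0, 2.0])
```

Read this way, the reported NSE of 0.96 for the DCNNs means the residual error is about 4% of the variance in the observed EC, while the 0.50 for QR-LASSO means half of that variance is left unexplained.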

  • Research Article
  • Cite count: 2
  • 10.1088/1361-6560/ad953e
BD-StableNet: a deep stable learning model with an automatic lesion area detection function for predicting malignancy in BI-RADS category 3–4A lesions
  • Dec 3, 2024
  • Physics in Medicine & Biology
  • Hui Qu + 8 more

The latest developments combining deep learning technology and medical image data have attracted wide attention and provide efficient noninvasive methods for the early diagnosis of breast cancer. The success of this task often depends on a large amount of data annotated by medical experts, which is time-consuming and may not always be feasible in the biomedical field. The lack of interpretability has greatly hindered the application of deep learning in the medical field. Currently, deep stable learning, including causal inference, makes deep learning models more predictive and interpretable. In this study, to distinguish malignant tumors in Breast Imaging-Reporting and Data System (BI-RADS) category 3-4A breast lesions, we propose BD-StableNet, a deep stable learning model for the automatic detection of lesion areas. In this retrospective study, we collected 3103 breast ultrasound images (1418 benign and 1685 malignant lesions) from 493 patients (361 benign and 132 malignant lesion patients) for model training and testing. Compared with other mainstream deep learning models, BD-StableNet has better prediction performance (accuracy = 0.952, area under the curve = 0.982, precision = 0.970, recall = 0.941, F1-score = 0.955 and specificity = 0.965). The lesion area prediction and class activation map results both verify that our proposed model is highly interpretable. The results indicate that BD-StableNet significantly enhances diagnostic accuracy and interpretability, offering a promising noninvasive approach for the diagnosis of BI-RADS category 3-4A breast lesions. Clinically, the use of BD-StableNet could reduce unnecessary biopsies, improve diagnostic efficiency, and ultimately enhance patient outcomes by providing more precise and reliable assessments of breast lesions.

  • Research Article
  • Cite count: 9
  • 10.1038/s41598-024-82931-5
Explainable artificial intelligence for stroke prediction through comparison of deep learning and machine learning models
  • Dec 28, 2024
  • Scientific Reports
  • Khadijeh Moulaei + 5 more

Failure to predict stroke promptly may lead to delayed treatment, causing severe consequences like permanent neurological damage or death. Early detection using deep learning (DL) and machine learning (ML) models can enhance patient outcomes and mitigate the long-term effects of strokes. The aim of this study is to compare these models, exploring their efficacy in predicting stroke. This study analyzed a dataset comprising 663 records from patients hospitalized at Hazrat Rasool Akram Hospital in Tehran, Iran, including 401 healthy individuals and 262 stroke patients. A total of eight established ML (SVM, XGB, KNN, RF) and DL (DNN, FNN, LSTM, CNN) models were utilized to predict stroke. Techniques such as 10-fold cross-validation and hyperparameter tuning were implemented to prevent overfitting. The study also focused on interpretability through Shapley Additive Explanations (SHAP). The evaluation of model’s performance was based on accuracy, specificity, sensitivity, F1-score, and ROC curve metrics. Among DL models, LSTM showed superior sensitivity at 96.15%, while FNN exhibited better specificity (96.0%), accuracy (96.0%), F1-score (95.0%), and ROC (98.0%) among DL models. For ML models, RF displayed higher sensitivity (99.9%), accuracy (99.0%), specificity (100%), F1-score (99.0%), and ROC (99.9%). Overall, RF outperformed all models, while DL models surpassed ML models in most metrics except for RF. DL models (CNN, LSTM, DNN, FNN) achieved sensitivities from 93.0 to 96.15%, specificities from 80.0 to 96.0%, accuracies from 92.0 to 96.0%, F1-scores from 87.34 to 95.0%, and ROC scores from 95.0 to 98.0%. In contrast, ML models (KNN, XGB, SVM) showed sensitivities between 29.0% and 94.0%, specificities between 89.47% and 96.0%, accuracies between 71.0% and 95.0%, F1-scores between 44.0% and 95.0%, and ROC scores between 64.0% and 95.0%. 
This study demonstrates the efficacy of DL and ML models in predicting stroke, with the RF models outperforming all others in key metrics. While DL models generally surpassed ML models, RF’s exceptional performance highlights the potential of combining these technologies for early stroke detection, significantly improving patient outcomes by preventing severe consequences like permanent neurological damage or death.
