Classification of piano performers with deep learning models


Similar Papers
  • Research Article
  • Cited by 16
  • 10.1038/s41598-024-82931-5
Explainable artificial intelligence for stroke prediction through comparison of deep learning and machine learning models
  • Dec 28, 2024
  • Scientific Reports
  • Khadijeh Moulaei + 5 more

Failure to predict stroke promptly may lead to delayed treatment, causing severe consequences like permanent neurological damage or death. Early detection using deep learning (DL) and machine learning (ML) models can enhance patient outcomes and mitigate the long-term effects of strokes. The aim of this study is to compare these models, exploring their efficacy in predicting stroke. This study analyzed a dataset comprising 663 records from patients hospitalized at Hazrat Rasool Akram Hospital in Tehran, Iran, including 401 healthy individuals and 262 stroke patients. A total of eight established ML (SVM, XGB, KNN, RF) and DL (DNN, FNN, LSTM, CNN) models were utilized to predict stroke. Techniques such as 10-fold cross-validation and hyperparameter tuning were implemented to prevent overfitting. The study also focused on interpretability through Shapley Additive Explanations (SHAP). Model performance was evaluated using accuracy, specificity, sensitivity, F1-score, and ROC curve metrics. Among DL models, LSTM showed superior sensitivity at 96.15%, while FNN exhibited better specificity (96.0%), accuracy (96.0%), F1-score (95.0%), and ROC (98.0%). For ML models, RF displayed higher sensitivity (99.9%), accuracy (99.0%), specificity (100%), F1-score (99.0%), and ROC (99.9%). Overall, RF outperformed all other models, while DL models surpassed the remaining ML models in most metrics. DL models (CNN, LSTM, DNN, FNN) achieved sensitivities from 93.0 to 96.15%, specificities from 80.0 to 96.0%, accuracies from 92.0 to 96.0%, F1-scores from 87.34 to 95.0%, and ROC scores from 95.0 to 98.0%. In contrast, ML models (KNN, XGB, SVM) showed sensitivities between 29.0% and 94.0%, specificities between 89.47% and 96.0%, accuracies between 71.0% and 95.0%, F1-scores between 44.0% and 95.0%, and ROC scores between 64.0% and 95.0%.
This study demonstrates the efficacy of DL and ML models in predicting stroke, with the RF model outperforming all others in key metrics. While DL models generally surpassed ML models, RF’s exceptional performance highlights the potential of combining these technologies for early stroke detection, significantly improving patient outcomes by preventing severe consequences like permanent neurological damage or death.
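
Sensitivity, specificity, accuracy, and F1-score as reported above all derive from the four confusion-matrix counts. A minimal, generic sketch of those definitions (not the study's code; labels assume 1 = stroke):

```python
def confusion_metrics(y_true, y_pred):
    """Derive the evaluation metrics used above from binary labels
    and predictions (1 = stroke, 0 = healthy)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),    # recall on stroke cases
        "specificity": tn / (tn + fp),    # recall on healthy cases
        "accuracy": (tp + tn) / len(y_true),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }
```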

  • Research Article
  • 10.1158/1538-7445.am2021-184
Abstract 184: The utility of deep metric learning for breast cancer identification on mammographic images
  • Jul 1, 2021
  • Cancer Research
  • Justin Du + 8 more

Purpose: Although deep learning (DL) models have shown increasing ability to accurately classify diagnostic images in oncology, significantly large amounts of well-curated data are often needed to match human level performance. Given the relative paucity of imaging datasets for less prevalent cancer types, there is an increasing need for methods which can improve the performance of deep learning models trained using limited diagnostic images. Deep metric learning (DML) is a potential method which can improve accuracy in deep learning models trained on limited datasets. Leveraging a triplet-loss function, DML exponentially increases training data compared to a traditional DL model. In this study, we investigated the utility of DML to improve the accuracy of DL models trained to classify cancerous lesions found on screening mammograms. Methods: Using a dataset of 2620 lesions found on routine screening mammograms, we trained both a traditional DL model and a DML model to classify suspicious lesions as cancerous or benign. The VGG16 architecture was used as the basis for the DL and DML models. Model performance was compared by calculating model accuracy, sensitivity, and specificity on a blinded test set of 378 lesions. In addition to individual model performance, we also measured agreement accuracy when the DL and DML models were combined. Sub-analyses were conducted to identify phenotypes which were best suited for each model type. Both models underwent hyperparameter optimization to identify the ideal batch size, learning rate, and regularization to prevent overfitting. Results: We found that the combination of the traditional DL model with the DML model resulted in the highest overall accuracy (78.7%), representing a 7.1% improvement compared to the traditional DL model (p<.001). Alone, the traditional DL model had an improved accuracy compared to the DML model (71.4% vs 66.4%).
The traditional DL model had a higher sensitivity (94.8% vs 73.6%), but lower specificity (34.7% vs 55.1%), compared to the DML model. Sub-analyses suggested the traditional DL model was more accurate on higher density breasts, whereas the DML model was more accurate on lower density breasts. Additionally, the traditional DL model had the highest accuracy on oval shaped lesions, compared to the DML model which was most accurate on irregularly shaped breast lesions. Conclusion: Our study suggests that the addition of DML models to traditional DL models can improve diagnostic image classification performance in cancer. Our results suggest DML models may provide increased specificity and help with classification of unique populations often misclassified by traditional DL models. Further studies investigating the utility of DML on other cancer imaging tasks are necessary to successfully build more robust DL models in cancer imaging. Citation Format: Justin Du, Sachin Umrao, Enoch Chang, Marina Joel, Aidan Gilson, Guneet Janda, Rachel Choi, Yongfeng Hui, Sanjay Aneja. The utility of deep metric learning for breast cancer identification on mammographic images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 184.
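
The triplet-loss family the abstract refers to can be sketched as follows; the squared-Euclidean distance and the margin value are illustrative assumptions, since the abstract does not state the exact formulation:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: pull the anchor embedding toward the
    positive (same class) and push it away from the negative (other
    class) until the negative is at least `margin` farther away."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0)
```

One reason a triplet formulation effectively enlarges a small training set is combinatorial: every valid (anchor, positive, negative) combination of the original lesions is a distinct training example.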

  • Research Article
  • 10.1093/humrep/deab130.259
P–260 Towards better explainable deep learning models for embryo selection in ART
  • Aug 6, 2021
  • Human Reproduction
  • Ashu Sharma + 4 more

Study question: Can heatmaps generated by occlusion explain the patterns learned by deep learning (DL) models classifying embryo viability in ART?
Summary answer: Occlusion experiments generate heatmaps that reveal which regions in frames of time-lapse video (TLV) are more discriminative for classification and prediction by the DL models.
What is known already: DL has been widely explored in ART for embryo selection. Depending upon the input (video or image), different DL models classifying embryo viability have been developed. However, whether a prediction is based on actual input features or random guessing is unknown. Embryo selection in ART is subjective. If the intention is to use DL models’ predictions to transfer, freeze or discard an embryo, explanations of how they interpret embryonic development features bring transparency and trust. In other areas, heatmaps are used to explain DL predictions. Heatmaps can be a tool for understanding the patterns learned by DL models for embryo selection.
Study design, size, duration: We trained two separate DL models for predicting the presence of a fetal heartbeat for the transferred embryos, and used occlusion-generated heatmaps to explain the predictions. For training, retrospective data was used. The input dataset consisted of 136 TLVs and corresponding patient data for 132 participants (128 single embryo transfers and 8 double embryo transfers) from both IVF and ICSI treatment. Each video was assessed by an embryologist.
Participants/materials, setting, methods: DL models (A: ResNet-18, B: VGG16) were trained to predict the presence of a fetal heartbeat on a single frame extracted from TLV after day three or later. Model A has a better recall (0.7) than B (0.5). Heatmaps explain the models’ recall rates by visually representing the patterns they learned. Using occlusion filters of size 30×30 with stride 14 and size 50×50 with stride 25, we generated heatmaps for both models.
Main results and the role of chance: The heatmaps generated using occlusion visually represent the patterns discovered by the DL models when predicting the presence of a fetal heartbeat. Using an occlusion filter of size 30×30 with stride 14, we verified that Model B has lower recall because the heatmaps show that the model attends to redundant features outside the embryo region in many input frames. This could be interpreted as meaning either that the model has not learned relevant patterns or that it is more robust to noise. This representation of DL models equips us for better decision-making: whether to accept or discard a prediction, or instead to train the model further, preprocess the training data, or change the network architecture. The heatmaps revealed that, for frames where the significant patterns learned by the models lie within the embryo region, more weight was given to specific features such as the inner cell mass, trophectoderm and some parts within the zona pellucida. Moreover, the heatmaps generated using occlusion are independent of the underlying model’s architecture, as the same experimental settings were used for both models. For an occlusion filter of size 50×50 with stride 25, the extent of the input regions (inside or outside the embryo) considered relevant could be visualized for both models A and B.
Limitations, reasons for caution: Heatmaps generated by occluding input regions give a visual representation of features in individual frames, not directly on videos. Besides occlusion, other heatmap techniques for explaining DL models (e.g., Grad-CAM) exist but were not evaluated. Furthermore, there is no quantitative measure for evaluating whether a heatmap is a good explanation.
Wider implications of the findings: The heatmaps make the patterns discovered by DL models visually recognizable and bring forth the prominent portions of the embryo regions. This will further improve understanding of, and trust in, DL models’ predictions. Visual representation of DL models using heatmaps enables interpreting a prediction, performing model analysis and determining scope for improvement.
Trial registration number: Not applicable
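
The occlusion procedure described above (e.g., a 30×30 patch with stride 14) amounts to sliding a blanking patch over the frame and recording how much the model's score drops at each position. A generic sketch, where `score_fn` stands in for the trained model's prediction and the zero fill value is an assumption:

```python
import numpy as np

def occlusion_heatmap(image, score_fn, patch=30, stride=14, fill=0.0):
    """Slide an occluding patch over the image; large drops in the
    model's score mark regions the model relies on for its prediction."""
    h, w = image.shape[:2]
    base = score_fn(image)
    ys = list(range(0, h - patch + 1, stride))
    xs = list(range(0, w - patch + 1, stride))
    heat = np.zeros((len(ys), len(xs)))
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill
            heat[i, j] = base - score_fn(occluded)  # score drop at (y, x)
    return heat
```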

  • Research Article
  • Cited by 20
  • 10.1038/s41598-024-66481-4
Explainable artificial intelligence (XAI) for predicting the need for intubation in methanol-poisoned patients: a study comparing deep and machine learning models
  • Jul 8, 2024
  • Scientific Reports
  • Khadijeh Moulaei + 14 more

The need for intubation in methanol-poisoned patients, if not predicted in time, can lead to irreparable complications and even death. Artificial intelligence (AI) techniques like machine learning (ML) and deep learning (DL) greatly aid in accurately predicting intubation needs for methanol-poisoned patients. So, our study aims to assess Explainable Artificial Intelligence (XAI) for predicting intubation necessity in methanol-poisoned patients, comparing deep learning and machine learning models. This study analyzed a dataset of 897 patient records from Loghman Hakim Hospital in Tehran, Iran, encompassing cases of methanol poisoning, including those requiring intubation (202 cases) and those not requiring it (695 cases). Eight established ML (SVM, XGB, DT, RF) and DL (DNN, FNN, LSTM, CNN) models were used. Techniques such as tenfold cross-validation and hyperparameter tuning were applied to prevent overfitting. The study also focused on interpretability through SHAP and LIME methods. Model performance was evaluated based on accuracy, specificity, sensitivity, F1-score, and ROC curve metrics. Among DL models, LSTM showed superior performance in accuracy (94.0%), sensitivity (99.0%), specificity (94.0%), and F1-score (97.0%). CNN led in ROC with 78.0%. For ML models, RF excelled in accuracy (97.0%) and specificity (100%), followed by XGB with sensitivity (99.37%), F1-score (98.27%), and ROC (96.08%). Overall, RF and XGB outperformed other models, with accuracy (97.0%) and specificity (100%) for RF, and sensitivity (99.37%), F1-score (98.27%), and ROC (96.08%) for XGB. ML models surpassed DL models across all metrics, with accuracies from 93.0% to 97.0% for DL and 93.0% to 99.0% for ML. Sensitivities ranged from 98.0% to 99.37% for DL and 93.0% to 99.0% for ML. DL models achieved specificities from 78.0% to 94.0%, while ML models ranged from 93.0% to 100%. F1-scores for DL were between 93.0% and 97.0%, and for ML between 96.0% and 98.27%. 
DL models scored ROC between 68.0% and 78.0%, while ML models ranged from 84.0% to 96.08%. Key features for predicting intubation necessity include GCS at admission, ICU admission, age, longer folic acid therapy duration, elevated BUN and AST levels, VBG_HCO3 at initial record, and hemodialysis presence. This study showcases XAI's effectiveness in predicting intubation necessity in methanol-poisoned patients. ML models, particularly RF and XGB, outperform DL counterparts, underscoring their potential for clinical decision-making.
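
The tenfold cross-validation both of these studies use to guard against overfitting reduces to simple index bookkeeping: shuffle the records once, split them into ten folds, and let each fold serve as the held-out set exactly once. A generic sketch, not the authors' code:

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Split sample indices into k shuffled, near-equal folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    return [idx[i::k] for i in range(k)]

def cross_validate(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; every sample is held out
    exactly once across the k rounds."""
    folds = k_fold_indices(n_samples, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test
```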

  • Research Article
  • Cited by 9
  • 10.1186/s40644-024-00790-9
Cross-institutional evaluation of deep learning and radiomics models in predicting microvascular invasion in hepatocellular carcinoma: validity, robustness, and ultrasound modality efficacy comparison
  • Oct 22, 2024
  • Cancer Imaging
  • Weibin Zhang + 7 more

Purpose: To conduct a head-to-head comparison between deep learning (DL) and radiomics models across institutions for predicting microvascular invasion (MVI) in hepatocellular carcinoma (HCC) and to investigate model robustness and generalizability through rigorous internal and external validation.
Methods: This retrospective study included 2304 preoperative images of 576 HCC lesions from two centers, with MVI status determined by postoperative histopathology. We developed DL and radiomics models for predicting the presence of MVI using B-mode ultrasound, contrast-enhanced ultrasound (CEUS) at the arterial, portal, and delayed phases, and a combined modality (B + CEUS). For radiomics, we constructed models with enlarged vs. original regions of interest (ROIs). A cross-validation approach was performed by training models on one center’s dataset and validating on the other, and vice versa. This allowed assessment of the validity of different ultrasound modalities and the cross-center robustness of the models. The optimal model combined with alpha-fetoprotein (AFP) was also validated. The head-to-head comparison was based on the area under the receiver operating characteristic curve (AUC).
Results: Thirteen DL models and 25 radiomics models using different ultrasound modalities were constructed and compared. B + CEUS was the optimal modality for both DL and radiomics models. The DL model achieved AUCs of 0.802–0.818 internally and 0.667–0.688 externally across the two centers, whereas radiomics achieved AUCs of 0.749–0.869 internally and 0.646–0.697 externally. The radiomics models showed overall improvement with enlarged ROIs (P < 0.05 for both CEUS and B + CEUS modalities). The DL models showed good cross-institutional robustness (P > 0.05 for all modalities, 1.6–2.1% differences in AUC for the optimal modality), whereas the radiomics models had relatively limited robustness across the two centers (12% drop-off in AUC for the optimal modality). Adding AFP improved the DL models (P < 0.05 externally) and maintained robustness well, but did not benefit the radiomics model (P > 0.05).
Conclusion: Cross-institutional validation indicated that DL demonstrated better robustness than radiomics for preoperative MVI prediction in patients with HCC, representing a promising solution to non-standardized ultrasound examination procedures.
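
The head-to-head comparisons above rest on the AUC, which can be computed directly from the rank (Mann-Whitney) statistic: the probability that a randomly chosen MVI-positive case receives a higher score than a randomly chosen negative case. A minimal sketch:

```python
import numpy as np

def auc(labels, scores):
    """AUC via the rank formulation: fraction of (positive, negative)
    pairs where the positive outscores the negative; ties count half."""
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))
```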

  • Research Article
  • Cited by 1
  • 10.1093/biomethods/bpae097
Robust RNA secondary structure prediction with a mixture of deep learning and physics-based experts.
  • Jan 6, 2025
  • Biology methods & protocols
  • Xiangyun Qiu

A mixture-of-experts (MoE) approach has been developed to mitigate the poor out-of-distribution (OOD) generalization of deep learning (DL) models for single-sequence-based prediction of RNA secondary structure. The main idea behind this approach is to use DL models for in-distribution (ID) test sequences to leverage their superior ID performances, while relying on physics-based models for OOD sequences to ensure robust predictions. One key ingredient of the pipeline, named MoEFold2D, is automated ID/OOD detection via consensus analysis of an ensemble of DL model predictions without requiring access to training data during inference. Specifically, motivated by the clustered distribution of known RNA structures, a collection of distinct DL models is trained by iteratively leaving one cluster out. Each DL model hence serves as an expert on all but one cluster in the training data. Consequently, for an ID sequence, all but one DL model makes accurate predictions consistent with one another, while an OOD sequence yields highly inconsistent predictions among all DL models. Through consensus analysis of DL predictions, test sequences are categorized as ID or OOD. ID sequences are subsequently predicted by averaging the DL models in consensus, and OOD sequences are predicted using physics-based models. Instead of remediating generalization gaps with alternative approaches such as transfer learning and sequence alignment, MoEFold2D circumvents unpredictable ID-OOD gaps and combines the strengths of DL and physics-based models to achieve accurate ID and robust OOD predictions.
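
The routing idea at the core of MoEFold2D can be sketched schematically: score the consistency of the DL ensemble's predictions, average the consensus for in-distribution (ID) inputs, and defer to the physics-based expert otherwise. The binary structure vectors, agreement score, and 0.8 threshold below are all illustrative assumptions; the real pipeline's consensus analysis over predicted base pairs is not specified in the abstract:

```python
import numpy as np

def mean_pairwise_agreement(preds):
    """Average fraction of positions on which each pair of DL models agrees."""
    n = len(preds)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return float(np.mean([np.mean(preds[i] == preds[j]) for i, j in pairs]))

def route_prediction(dl_preds, physics_pred, threshold=0.8):
    """High ensemble consensus -> treat the sequence as ID and average
    the DL experts; low consensus -> treat it as OOD and defer to the
    physics-based model."""
    preds = np.asarray(dl_preds)
    if mean_pairwise_agreement(preds) >= threshold:
        return preds.mean(axis=0).round()   # ID: consensus of DL models
    return np.asarray(physics_pred)          # OOD: physics-based expert
```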

  • Research Article
  • 10.1186/s12885-025-14971-7
Deep multi-instance learning model based on gadoxetic acid-enhanced MRI for predicting microvascular invasion of hepatocellular carcinoma: a multicenter, retrospective study
  • Oct 22, 2025
  • BMC Cancer
  • Yi Luo + 7 more

Objective: Microvascular invasion (MVI) is of great significance for the individualized treatment of hepatocellular carcinoma (HCC), and preoperative noninvasive prediction of MVI is still an urgent clinical problem. We aimed to explore the effects of different regions of interest (ROI) and image input dimensions on the performance of deep learning (DL) models, and to select the best result to develop and validate a DL model for preoperative prediction of MVI.
Materials and methods: A total of 206 patients with pathologically confirmed HCC from three hospitals were retrospectively enrolled and divided into training, internal validation and external test sets. Based on hepatobiliary phase (HBP) images of gadoxetic acid-enhanced MRI, 2D DL, 3D DL and 2.5D deep multi-instance learning (MIL) models were established. The receiver operating characteristic (ROC) curve was used to evaluate the predictive efficacy of the above models. Based on the optimal model, the T1WI-FS and T2WI-FS images were preprocessed correspondingly, and a multimodal prediction model including three sequences was constructed. ROC and decision curves were used to visualize the predictive ability of the model.
Results: Compared with the 2D DL and 3D DL models, the 2.5D DL model based on all axial images of the ROI had the highest performance, with AUC values of 0.802 (95% CI, 0.669–0.936) and 0.759 (95% CI, 0.643–0.875) in the validation and test sets. The AUCs of the multimodal MRI model were 0.954 (95% CI, 0.920–0.989) in the training set, 0.857 (95% CI, 0.736–0.978) in the validation set, and 0.788 (95% CI, 0.681–0.895) in the test set.
Conclusion: The DL model that takes all axial slices of the intratumoral and peritumoral regions as input shows robust capability in predicting MVI, which is expected to help clinical decision-making for individualized treatment of HCC.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12885-025-14971-7.
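
In a 2.5D multi-instance learning setup like the one above, each axial slice is an instance in a bag and the lesion-level prediction comes from aggregating per-slice scores. The max/mean pooling below is a generic MIL sketch; the paper's actual aggregator (possibly attention-based) is not given in the abstract:

```python
import numpy as np

def mil_bag_score(instance_scores, mode="max"):
    """Aggregate per-slice MVI scores into one lesion-level score.
    'max' flags the bag if any single slice looks positive;
    'mean' averages evidence across all slices."""
    s = np.asarray(instance_scores, dtype=float)
    if mode == "max":
        return float(s.max())
    if mode == "mean":
        return float(s.mean())
    raise ValueError(f"unknown mode: {mode}")
```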

  • Research Article
  • Cited by 20
  • 10.3390/cancers13164077
Deep Learning Analysis of CT Images Reveals High-Grade Pathological Features to Predict Survival in Lung Adenocarcinoma.
  • Aug 13, 2021
  • Cancers
  • Yeonu Choi + 8 more

Simple Summary: The high-grade pattern (micropapillary or solid pattern, MPSol) in lung adenocarcinoma is associated with poor prognosis. We aimed to develop a deep learning (DL) model for predicting any high-grade pattern in lung adenocarcinoma and to assess the prognostic performance of the model in advanced lung cancer patients who underwent neoadjuvant or definitive concurrent chemoradiation therapy (CCRT). Our model, considering both the tumor and the peri-tumoral area, showed an area under the curve value of 0.8. The DL model worked well in an independent validation set of advanced lung cancer, stratifying survival significantly. The subgroup with a high probability of MPSol estimated by the DL model showed a 1.76-fold higher risk of death. Thus, our DL model can be useful in estimating high-grade histologic patterns in lung adenocarcinomas and predicting clinical outcomes of patients with advanced lung cancer who underwent neoadjuvant or definitive CCRT.
Abstract: We aimed to develop a deep learning (DL) model for predicting high-grade patterns in lung adenocarcinomas (ADC) and to assess the prognostic performance of the model in advanced lung cancer patients who underwent neoadjuvant or definitive concurrent chemoradiation therapy (CCRT). We included 275 patients with 290 early lung ADCs from an ongoing prospective clinical trial in the training dataset, which we split into internal-training and internal-validation datasets. We constructed a diagnostic DL model of high-grade patterns of lung ADC considering both a morphologic view of the tumor and a context view of the area surrounding the tumor (MC3DN; morphologic-view context-view 3D network). Validation was performed on an independent dataset of 417 patients with advanced non-small cell lung cancer who underwent neoadjuvant or definitive CCRT. The area under the curve value of the DL model was 0.8 for the prediction of high-grade histologic patterns such as micropapillary and solid patterns (MPSol).
When our model was applied to the validation set, a high probability of MPSol was associated with worse overall survival (probability of MPSol >0.5 vs. <0.5; 5-year OS rate 56.1% vs. 70.7%), indicating that our model could predict the clinical outcomes of advanced lung cancer patients. The subgroup with a high probability of MPSol estimated by the DL model showed a 1.76-fold higher risk of death (HR 1.76, 95% CI 1.16–2.68). Our DL model can be useful in estimating high-grade histologic patterns in lung ADCs and predicting clinical outcomes of patients with advanced lung cancer who underwent neoadjuvant or definitive CCRT.

  • Research Article
  • Cited by 15
  • 10.1016/j.eclinm.2023.101905
Ultrasound image-based deep learning to assist in diagnosing gross extrathyroidal extension thyroid cancer: a retrospective multicenter study
  • Mar 24, 2023
  • eClinicalMedicine
  • Qi Qi + 14 more


  • Research Article
  • Cited by 5
  • 10.1016/j.ijmedinf.2025.105812
Deep learning and machine learning in CT-based COPD diagnosis: Systematic review and meta-analysis.
  • Apr 1, 2025
  • International journal of medical informatics
  • Qian Wu + 3 more


  • Research Article
  • Cited by 8
  • 10.1038/s41598-020-79809-7
Development and validation of deep learning algorithms for automated eye laterality detection with anterior segment photography
  • Jan 12, 2021
  • Scientific Reports
  • Ce Zheng + 9 more

This paper aimed to develop and validate a deep learning (DL) model for automated detection of the laterality of the eye on anterior segment photographs. Anterior segment photographs for training a DL model were collected with a Scheimpflug anterior segment analyzer. We applied transfer learning and fine-tuning of pre-trained deep convolutional neural networks (InceptionV3, VGG16, MobileNetV2) to develop DL models for determining eye laterality. Testing datasets, from Scheimpflug and slit-lamp digital camera photography, were employed to test the DL models, and the results were compared with a classification performed by human experts. The performance of the DL models was evaluated by accuracy, sensitivity, specificity, receiver operating characteristic curves, and corresponding area under the curve values. A total of 14,468 photographs were collected for the development of the DL models. After training for 100 epochs, the DL model based on InceptionV3 achieved an area under the receiver operating characteristic curve of 0.998 (95% CI 0.924–0.958) for detecting eye laterality. In the external testing dataset (76 primary gaze photographs taken by a digital camera), the DL model achieved an accuracy of 96.1% (95% CI 91.7%–100%), which is better than the accuracies of 72.3% (95% CI 62.2%–82.4%), 82.8% (95% CI 78.7%–86.9%) and 86.8% (95% CI 82.5%–91.1%) achieved by human graders. Our study demonstrated that this high-performing DL model can be used for automated laterality labeling of eyes. Our DL model is useful for managing a large volume of anterior segment images taken with a slit-lamp camera in the clinical setting.

  • Conference Article
  • Cited by 74
  • 10.1109/ase.2019.00043
Apricot: A Weight-Adaptation Approach to Fixing Deep Learning Models
  • Nov 1, 2019
  • Hao Zhang + 1 more

A deep learning (DL) model is inherently imprecise. To address this problem, existing techniques retrain a DL model over a larger training dataset or with the help of fault injected models or using the insight of failing test cases in a DL model. In this paper, we present Apricot, a novel weight-adaptation approach to fixing DL models iteratively. Our key observation is that if the deep learning architecture of a DL model is trained over many different subsets of the original training dataset, the weights in the resultant reduced DL model (rDLM) can provide insights on the adjustment direction and magnitude of the weights in the original DL model to handle the test cases that the original DL model misclassifies. Apricot generates a set of such reduced DL models from the original DL model. In each iteration, for each failing test case experienced by the input DL model (iDLM), Apricot adjusts each weight of this iDLM toward the average weight of these rDLMs correctly classifying the test case and/or away from that of these rDLMs misclassifying the same test case, followed by training the weight-adjusted iDLM over the original training dataset to generate a new iDLM for the next iteration. The experiment using five state-of-the-art DL models shows that Apricot can increase the test accuracy of these models by 0.87%-1.55% with an average of 1.08%. The experiment also reveals the complementary nature of these rDLMs in Apricot.
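
The per-weight adjustment Apricot applies for each failing test case can be sketched roughly as below. The step size and the exact way the "toward" and "away" terms combine are assumptions of this sketch; the abstract states only the adjustment directions:

```python
import numpy as np

def apricot_adjust(idlm_w, correct_rdlm_ws, wrong_rdlm_ws, step=0.1):
    """Move each weight of the input DL model (iDLM) toward the average
    weight of reduced models (rDLMs) that classify the failing test case
    correctly, and away from the average of those that misclassify it."""
    w = np.asarray(idlm_w, dtype=float)
    delta = np.zeros_like(w)
    if len(correct_rdlm_ws):
        delta += np.mean(correct_rdlm_ws, axis=0) - w   # pull toward
    if len(wrong_rdlm_ws):
        delta -= np.mean(wrong_rdlm_ws, axis=0) - w     # push away
    return w + step * delta
```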

  • Research Article
  • Cited by 16
  • 10.1007/s11356-024-35764-8
An examination of daily CO2 emissions prediction through a comparative analysis of machine learning, deep learning, and statistical models
  • Jan 1, 2025
  • Environmental Science and Pollution Research
  • Adewole Adetoro Ajala + 3 more

Human-induced global warming, primarily attributed to the rise in atmospheric CO2, poses a substantial risk to the survival of humanity. While most research focuses on predicting annual CO2 emissions, which are crucial for setting long-term emission mitigation targets, the precise prediction of daily CO2 emissions is equally vital for setting short-term targets. This study examines the performance of 14 models in predicting daily CO2 emissions data from 1/1/2022 to 30/9/2023 across the top four polluting regions (China, India, the USA, and the EU27&UK). The 14 models used in the study include four statistical models (ARMA, ARIMA, SARMA, and SARIMA), three machine learning models (support vector machine (SVM), random forest (RF), and gradient boosting (GB)), and seven deep learning models (artificial neural network (ANN), recurrent neural network variations such as gated recurrent unit (GRU), long short-term memory (LSTM), bidirectional LSTM (BILSTM), and three hybrid combinations of CNN-RNN). Performance evaluation employs four metrics (R2, MAE, RMSE, and MAPE). The results show that the machine learning (ML) and deep learning (DL) models, with higher R2 (0.714–0.932) and lower RMSE (0.247–0.480) values, outperformed the statistical models, which had R2 (−0.060–0.719) and RMSE (0.537–1.695) values, in predicting daily CO2 emissions across all four regions. The performance of the ML and DL models was further enhanced by differencing, a technique that improves accuracy by ensuring stationarity and creating additional features and patterns from which the model can learn. Additionally, applying ensemble techniques such as bagging and voting improved the performance of the ML models by approximately 9.6%, whereas hybrid combinations of CNN-RNN enhanced the performance of the RNN models. In summary, the performance of the ML and DL models was relatively similar.
However, due to the high computational requirements associated with DL models, the recommended models for daily CO2 emission prediction are ML models using the ensemble techniques of voting and bagging. These models can assist in accurately forecasting daily emissions, aiding authorities in setting targets for CO2 emission reduction.
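
The two ensemble ideas the study credits with the ~9.6% improvement can be sketched generically: voting (for regression) averages the per-model forecasts, and bagging refits each base model on a bootstrap resample of the training data. A minimal sketch, not the authors' pipeline:

```python
import numpy as np

def voting_forecast(model_forecasts):
    """'Voting' for regression: average the per-model daily CO2 forecasts."""
    return np.mean(np.asarray(model_forecasts, dtype=float), axis=0)

def bootstrap_resample(X, y, seed=0):
    """The data half of bagging: one bootstrap resample (rows drawn with
    replacement) on which a base model would then be refit."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]
```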

  • Research Article
  • Cited by 23
  • 10.1007/s11356-021-13503-7
Spatial modelling of soil salinity: deep or shallow learning models?
  • Mar 23, 2021
  • Environmental Science and Pollution Research
  • Aliakbar Mohammadifar + 3 more

Understanding the spatial distribution of soil salinity is required to conserve land against degradation and desertification. Against this background, this study is the first attempt to predict soil salinity in the Jaghin basin, in southern Iran, by applying and comparing the performance of four deep learning (DL) models (deep convolutional neural networks (DCNNs), densely connected deep neural networks (DenseDNNs), recurrent neural networks–long short-term memory (RNN-LSTM) and recurrent neural networks–gated recurrent unit (RNN-GRU)) and six shallow machine learning (ML) models (bagged classification and regression tree (BCART), cforest, cubist, quantile regression with LASSO penalty (QR-LASSO), ridge regression (RR) and support vector machine (SVM)). To do this, 49 environmental Landsat 8-derived variables, including digital elevation model (DEM)-extracted covariates, soil-salinity indices, and other variables (e.g., soil order, lithology, land use), were mapped spatially. For assessing the relationships between soil salinity and its controlling factors, we collected 319 surficial (0–5 cm depth) soil samples and measured soil salinity as electrical conductivity (EC). We then selected the most important features (covariates) controlling soil salinity by applying a MARS model. The performance of the DL and shallow ML models for generating soil salinity spatial maps (SSSMs) was assessed using a Taylor diagram and the Nash–Sutcliffe efficiency coefficient (NSE). Among all 10 predictive models, the four DL models with NSE ≥ 0.9 (DCNNs was the most accurate model, with NSE = 0.96) performed better than the six shallow ML models with NSE ≤ 0.83 (QR-LASSO was the weakest predictive model, with NSE = 0.50). Based on DCNNs, predicted EC values ranged between 0.67 and 14.73 dS/m, whereas for QR-LASSO the corresponding EC values were 0.37 to 19.6 dS/m.
Overall, DL models performed better than shallow ML models for producing the SSSMs, and we therefore recommend applying DL models for prediction purposes in environmental sciences.
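
The NSE values that rank the models above follow a simple formula: one minus the ratio of the model's squared error to the variance of the observations, so 1.0 is a perfect fit and 0.0 is no better than always predicting the mean EC. A minimal sketch:

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations
    from the observed mean. 1.0 = perfect fit; 0.0 = no better than
    predicting the mean of the observations."""
    o = np.asarray(observed, dtype=float)
    s = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((o - s) ** 2) / np.sum((o - o.mean()) ** 2)
```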

  • Research Article
  • Cited by 5
  • 10.1007/s00330-024-11105-8
MRI deep learning models for assisted diagnosis of knee pathologies: a systematic review
  • Oct 18, 2024
  • European Radiology
  • Keiley Mead + 5 more

Objectives: Despite showing encouraging outcomes, the precision of deep learning (DL) models using different convolutional neural networks (CNNs) for diagnosis remains under investigation. This systematic review aims to summarise the status of DL MRI models developed for assisting the diagnosis of a variety of knee abnormalities.
Materials and methods: Five databases were systematically searched, employing predefined terms such as ‘Knee AND 3D AND MRI AND DL’. Selected inclusion criteria were used to screen publications by title, abstract, and full text. The synthesis of results was performed by two independent reviewers.
Results: Fifty-four articles were included. The studies focused on anterior cruciate ligament injuries (n = 19, 36%), osteoarthritis (n = 9, 17%), meniscal injuries (n = 13, 24%), abnormal knee appearance (n = 11, 20%), and other (n = 2, 4%). The DL models in this review primarily used the following CNNs: ResNet (n = 11, 21%), VGG (n = 6, 11%), DenseNet (n = 4, 8%), and DarkNet (n = 3, 6%). DL models showed high performance metrics compared to ground truth. DL models for detecting a specific injury outperformed those for general abnormality detection by up to 4.5%.
Conclusion: Despite the varied study designs used among the reviewed articles, DL models showed promising outcomes in the assisted detection of selected knee pathologies by MRI. This review underscores the importance of validating these models with larger MRI datasets to close the existing gap between current DL model performance and clinical requirements.
Key Points: Question: What is the status of DL model availability for knee pathology detection in MRI, and what is their clinical potential? Findings: Pathology-specific DL models reported higher accuracy compared to DL models for the detection of general abnormalities of the knee. DL model performance was mainly influenced by the quantity and diversity of data available for model training. Clinical relevance: These findings should encourage future developments to improve patient care, support personalised diagnosis and treatment, optimise costs, and advance artificial intelligence-based medical imaging practices.
