Efficient Normalized Conformal Prediction and Uncertainty Quantification for Anti-Cancer Drug Sensitivity Prediction with Deep Regression Forests.
Deep learning models are being adopted and applied across various critical medical tasks, yet they are primarily trained to produce point predictions without accompanying degrees of confidence. Medical practitioners' trust in deep learning models increases when predictions are paired with uncertainty estimates. Conformal Prediction has emerged as a promising method to pair machine learning models with prediction intervals, allowing a view of the model's uncertainty. However, popular uncertainty estimation methods for conformal prediction fail to provide highly accurate heteroskedastic intervals. In this paper, we propose a method to estimate the uncertainty of each sample by calculating the variance obtained from a Deep Regression Forest. We show that the deep regression forest variance improves the efficiency and coverage of normalized inductive conformal prediction when applied to an anti-cancer drug sensitivity prediction task.
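The normalization recipe described above can be sketched in a few lines. The example below is an illustrative stand-in, not the paper's implementation: an ordinary scikit-learn random forest plays the role of the Deep Regression Forest, and the across-tree variance serves as the per-sample difficulty estimate that scales the conformal intervals.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Toy data, split into a proper training set and a calibration set.
X, y = make_regression(n_samples=600, n_features=8, noise=10.0, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.25, random_state=0)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)

def forest_mean_std(model, X):
    # Mean and std of the per-tree predictions; the std acts as the
    # per-sample difficulty estimate sigma_i.
    per_tree = np.stack([tree.predict(X) for tree in model.estimators_])
    return per_tree.mean(axis=0), per_tree.std(axis=0) + 1e-8

# Normalized nonconformity scores on the calibration set:
#   alpha_i = |y_i - yhat_i| / sigma_i
mu_cal, sig_cal = forest_mean_std(forest, X_cal)
alpha = np.abs(y_cal - mu_cal) / sig_cal

# Conformal quantile for significance level epsilon = 0.1 (90% coverage).
eps = 0.1
n = len(alpha)
q = np.sort(alpha)[int(np.ceil((n + 1) * (1 - eps))) - 1]

# Heteroskedastic intervals: wider where the trees disagree more.
# (The calibration features are reused here purely as demo inputs.)
mu, sig = forest_mean_std(forest, X_cal)
lower, upper = mu - q * sig, mu + q * sig
```

Because the nonconformity score divides each absolute error by sigma_i, the calibrated intervals widen exactly where the forest is uncertain, which is what makes them heteroskedastic rather than constant-width.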
- Research Article
- 10.5194/soil-11-553-2025
- Jul 22, 2025
- SOIL
Abstract. Uncertainty quantification is a crucial step in the practical application of soil spectral models, particularly in supporting real-world decision making and risk assessment. While machine learning has made remarkable strides in predicting various physicochemical properties of soils using spectroscopy, its practical utility in decision making remains limited without quantified uncertainty. Despite its importance, uncertainty quantification is rarely incorporated into soil spectral models: existing methods are either computationally demanding, fail to achieve the desired coverage of observed data, or struggle to handle out-of-domain uncertainty. This study introduces an innovative application of Monte Carlo conformal prediction (MC-CP) to quantify uncertainty in deep-learning models for predicting clay content from mid-infrared spectroscopy. We compared MC-CP with two established methods: (1) Monte Carlo dropout and (2) conformal prediction. Monte Carlo dropout generates prediction intervals for each sample and can address larger uncertainties associated with out-of-domain data. Conformal prediction, on the other hand, guarantees ideal coverage of true values but generates unnecessarily wide prediction intervals, making it overly conservative for many practical applications. Using 39 177 samples from the mid-infrared spectral library of the Kellogg Soil Survey Laboratory to build convolutional neural networks, we found that Monte Carlo dropout itself falls short in achieving the desired coverage – its 90 % prediction intervals only covered the observed values in 74 % of the cases, well below the expected 90 % coverage. In contrast, MC-CP successfully combines the strengths of both methods. It achieved a prediction interval coverage probability of 91 %, closely matching the expected 90 % coverage and far surpassing the performance of the Monte Carlo dropout.
Additionally, the mean prediction interval width for MC-CP was 9.05 %, narrower than the conformal prediction's 11.11 %. The success of MC-CP enhances the real-world applicability of soil spectral models, paving the way for their integration into large-scale machine learning models, such as soil inference systems, and further transforming decision making and risk assessment in soil science.
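The core of the MC-CP idea, combining Monte Carlo sampling with split conformal calibration, might be sketched as below. The MC draws here are simulated with NumPy rather than produced by a dropout-enabled CNN, and all names are illustrative assumptions, not the article's code:

```python
import numpy as np

def mc_cp_intervals(mc_cal, y_cal, mc_test, eps=0.1):
    # mc_cal:  (n_draws, n_cal) stochastic forward passes on the calibration
    #          set (e.g., from Monte Carlo dropout); mc_test likewise.
    # The MC std gives a per-sample uncertainty, and split conformal
    # calibration rescales it so the intervals attain the target coverage.
    mu_c, sd_c = mc_cal.mean(axis=0), mc_cal.std(axis=0) + 1e-8
    scores = np.abs(y_cal - mu_c) / sd_c          # "MC std units" off truth
    n = len(scores)
    q = np.sort(scores)[int(np.ceil((n + 1) * (1 - eps))) - 1]
    mu_t, sd_t = mc_test.mean(axis=0), mc_test.std(axis=0) + 1e-8
    return mu_t - q * sd_t, mu_t + q * sd_t

rng = np.random.default_rng(0)
truth = rng.normal(size=500)
draws = truth + rng.normal(scale=0.5, size=(50, 500))   # simulated MC passes
lo, hi = mc_cp_intervals(draws[:, :250], truth[:250], draws[:, 250:])
```

The conformal step supplies the coverage guarantee that MC dropout alone lacks, while the MC std keeps the interval widths sample-specific rather than uniformly wide.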
- Research Article
- 10.1117/1.jmm.20.4.041206
- Oct 15, 2021
- Journal of Micro/Nanopatterning, Materials, and Metrology
Background: Machine learning is predicted to have an increasingly important role in semiconductor metrology. Prediction intervals that describe the reliability of the predictive performance of machine learning models are important to guide decision making and to improve trust in deep learning and other forms of machine learning and artificial intelligence. Image processing is an important application of artificial intelligence. Low-dose images from the scanning electron microscope (SEM) are often used for roughness measurements such as line edge roughness (LER) because of relatively small acquisition times and resist shrinkage, but such images are corrupted by noise, blur, edge effects, and other instrument errors. LER affects semiconductor device performance and the yield of the manufacturing process. Aim: We consider prediction intervals for the deep convolutional neural network EDGENet, which was trained on a large dataset of simulated SEM images and directly estimates the edge positions from a SEM rough line image containing an unknown level of Poisson noise. Approach: Conformal prediction is a relatively recent, increasingly popular, rigorously proven, and simple methodology to address this need for both classification and regression problems, and it does not use distributional assumptions such as Gaussianity or the Bayesian framework; one new variant combines it with quantile regression, another technique for generating prediction intervals. Results: We illustrate the strengths and limitations of different conformal prediction procedures for the EDGENet approach to LER estimation. Combining these approaches into ensemble schemes and incorporating domain knowledge produces more informative prediction intervals. Conclusions: Deep learning models can help in the estimation of LER, but their acceptance has been hindered by a lack of trust in these techniques.
Prediction intervals that provide coverage guarantees are an approach to alleviate this problem and may catalyze the transition within semiconductor manufacturing to a wider acceptance and implementation of machine learning.
- Research Article
- 10.1609/aaaiss.v1i1.27492
- Oct 3, 2023
- Proceedings of the AAAI Symposium Series
Precise estimation of predictive uncertainty in deep neural networks is a critical requirement for reliable decision-making in machine learning and statistical modeling, particularly in the context of medical AI. Conformal Prediction (CP) has emerged as a promising framework for representing model uncertainty by providing well-calibrated confidence levels for individual predictions. However, the quantification of model uncertainty in conformal prediction remains an active research area, yet to be fully addressed. In this paper, we explore state-of-the-art CP methodologies and their theoretical foundations. We propose a probabilistic approach to quantifying the model uncertainty derived from the prediction sets produced in conformal prediction and provide certified boundaries for the computed uncertainty. By doing so, we allow model uncertainty measured by CP to be compared with other uncertainty quantification methods such as Bayesian (e.g., MC-Dropout and DeepEnsemble) and Evidential approaches.
- Research Article
- 10.1148/ryai.240032
- Nov 27, 2024
- Radiology. Artificial intelligence
Purpose To apply conformal prediction to a deep learning (DL) model for intracranial hemorrhage (ICH) detection and evaluate model performance in detection as well as model accuracy in identifying challenging cases. Materials and Methods This was a retrospective (November-December 2017) study of 491 noncontrast head CT volumes from the CQ500 dataset, in which three senior radiologists annotated sections containing ICH. The dataset was split into definite and challenging (uncertain) subsets, in which challenging images were defined as those in which there was disagreement among readers. A DL model was trained on patients from the definite data (training dataset) to perform ICH localization and classification into five classes. To develop an uncertainty-aware DL model, 1546 sections of the definite data (calibration dataset) were used for Mondrian conformal prediction (MCP). The uncertainty-aware DL model was tested on 8401 definite and challenging sections to assess its ability to identify challenging sections. The difference in predictive performance (P value) and ability to identify challenging sections (accuracy) were reported. Results The study included 146 patients (mean age, 45.7 years ± 9.9 [SD]; 76 [52.1%] men, 70 [47.9%] women). After the MCP procedure, the model achieved an F1 score of 0.919 for localization and classification. Additionally, it correctly identified patients with challenging cases with 95.3% (143 of 150) accuracy. It did not incorrectly label any definite sections as challenging. Conclusion The uncertainty-aware MCP-augmented DL model achieved high performance in ICH detection and high accuracy in identifying challenging sections, suggesting its usefulness in automated ICH detection and potential to increase trustworthiness of DL models in radiology. Keywords: CT, Head and Neck, Brain, Brain Stem, Hemorrhage, Feature Detection, Diagnosis, Supervised Learning Supplemental material is available for this article. 
© RSNA, 2025. See also the commentary by Ngum and Filippi in this issue.
- Research Article
- 10.1002/ese3.1710
- Feb 27, 2024
- Energy Science & Engineering
In recent years, machine and deep learning models have attracted significant attention for electricity price forecasting in global wholesale electricity markets. Yet, a predominant focus on point forecasts in much of the literature limits the practical application of these models due to the absence of uncertainty quantification. In this study, we first perform an analysis of the electricity price trends in the Mexican wholesale electricity market to determine the influence of key variables. Using independent component analysis and wavelet coherence analysis, we were able to identify the primary determinants influencing locational marginal electricity prices. Subsequently, we applied four different models covering the most important algorithms proposed in the literature for electricity price forecasting. Our findings revealed that the most accurate forecasting results were achieved using a deep learning‐based method, with a decision tree‐based model trailing closely. Finally, we incorporate conformal prediction for uncertainty quantification by calculating prediction intervals with a target coverage level of 95%. The conformal prediction intervals provide a more comprehensive view of possible future scenarios, enhancing economic efficiency, risk management, and decision‐making processes. This is particularly important because of the dynamic nature of electricity markets, where prices are strongly influenced by multiple factors.
- Conference Article
- 10.1109/ssci50451.2021.9659853
- Dec 5, 2021
Conformal prediction can be applied on top of any machine learning predictive regression model, thus turning it into a conformal regressor. Given a significance level $\epsilon$, conformal regressors output valid prediction intervals, i.e., the probability that the interval covers the true value is exactly $1-\epsilon$. To obtain validity, a calibration set that is not used for training the model must be set aside. In standard inductive conformal regression, the size of the prediction intervals is then determined by the absolute error made by the predictive model on a specific instance in the calibration set, where different significance levels correspond to different instances. In this setting, all prediction intervals will have the same size, making the resulting models very unspecific. When adding a technique called normalization, however, the difficulty of each instance is estimated, and the interval sizes are adjusted accordingly. An integral part of normalized conformal regressors is a parameter called $\beta$, which determines the relative importance of the difficulty estimation and the error of the model. In this study, the effects of different underlying models, difficulty estimation functions and $\beta$-values are investigated. The results from a large empirical study, using twenty publicly available data sets, show that better difficulty estimation functions will lead to both tighter and more specific prediction intervals.
Furthermore, it is found that the $\beta$-values used strongly affect the conformal regressor. While there is no specific $\beta$-value that will always minimize the interval sizes, lower $\beta$-values lead to more variation in the interval sizes, i.e., more specific models. In addition, the analysis also identifies that the normalization procedure introduces a small but unfortunate bias in the models. More specifically, normalization using low $\beta$-values means that smaller intervals are more likely to be erroneous, while the opposite is true for higher $\beta$-values.
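A minimal sketch of the normalized conformal regressor with the $\beta$ parameter discussed above. The oracle difficulty estimate and the trivial zero predictor are purely illustrative assumptions, chosen so the effect of $\beta$ is easy to see:

```python
import numpy as np

def normalized_icp(y_cal, yhat_cal, sigma_cal, yhat, sigma, eps=0.1, beta=0.5):
    # Normalized nonconformity: alpha = |error| / (sigma + beta).
    # beta trades off the difficulty estimate against the raw error:
    # a large beta drives widths toward a common (standard ICP) size,
    # while a small beta lets widths follow sigma closely.
    alpha = np.abs(y_cal - yhat_cal) / (sigma_cal + beta)
    n = len(alpha)
    q = np.sort(alpha)[int(np.ceil((n + 1) * (1 - eps))) - 1]
    half = q * (sigma + beta)
    return yhat - half, yhat + half

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 400)
y = rng.normal(scale=0.1 + x)          # heteroskedastic noise around 0
sigma = 0.1 + x                        # oracle difficulty estimate
yhat = np.zeros_like(x)                # trivial predictor, for illustration

lo_s, hi_s = normalized_icp(y[:200], yhat[:200], sigma[:200],
                            yhat[200:], sigma[200:], beta=0.01)
lo_b, hi_b = normalized_icp(y[:200], yhat[:200], sigma[:200],
                            yhat[200:], sigma[200:], beta=10.0)
```

Running this reproduces the qualitative effect reported above: with a small $\beta$ the interval widths vary strongly across instances (more specific models), while a large $\beta$ washes the difficulty estimate out and all intervals approach one width.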
- Research Article
- 10.1186/s13244-024-01863-w
- Nov 29, 2024
- Insights into Imaging
Objectives: To estimate the uncertainty of a deep learning (DL)-based prostate segmentation algorithm through conformal prediction (CP) and to assess its effect on the calculation of the prostate volume (PV) in patients at risk of prostate cancer (PC).
Methods: Three hundred seventy-seven multi-center 3-Tesla axial T2-weighted exams from biopsied males (66.64 ± 7.47 years) at risk of PC were retrospectively included in the study. Assessment of PV based on the PI-RADS 2.1 ellipsoid formula (PV_ref) was available for the included patients. Prostate segmentations were obtained from a DL model and used to calculate the PV (PV_DL). CP was applied at a confidence level of 85% to flag unreliable pixel segmentations of the DL model. Subsequently, the PV (PV_CP) was calculated after disregarding uncertain pixel segmentations. Agreement between PV_DL and PV_CP was evaluated against the reference standard PV_ref. The intraclass correlation coefficient (ICC) and Bland–Altman plots were used to assess agreement. The relative volume difference (RVD) was used to evaluate PV calculation accuracy, and the Wilcoxon signed-rank test was used to assess statistical differences. A p-value < 0.05 was considered statistically significant.
Results: Conformal prediction significantly reduced the RVD compared with the DL algorithm (RVD = −2.81 ± 8.85 vs. RVD = −8.01 ± 11.50). PV_CP showed significantly larger agreement than PV_DL with the reference standard PV_ref (mean difference (95% limits of agreement): PV_CP 1.27 mL (−13.64; 16.17 mL); PV_DL 6.07 mL (−14.29; 26.42 mL)), with an excellent ICC (PV_CP: 0.97 (95% CI: 0.97 to 0.98)).
Conclusion: Uncertainty quantification through CP increases the accuracy and reliability of DL-based PV assessment in patients at risk of PC.
Critical relevance statement: Conformal prediction can flag uncertain pixel predictions of DL-based prostate MRI segmentation at a desired confidence level, increasing the reliability and safety of prostate volume assessment in patients at risk of prostate cancer.
Key points: Conformal prediction can flag uncertain pixel predictions of prostate segmentations at a user-defined confidence level. Deep learning with conformal prediction shows high accuracy in prostate volumetric assessment. Agreement between automatic and ellipsoid-derived volume was significantly larger with conformal prediction.
- Research Article
- 10.1016/j.xphs.2020.09.055
- Oct 17, 2020
- Journal of Pharmaceutical Sciences
One of the challenges with predictive modeling is how to quantify the reliability of the models' predictions on new objects. In this work we give an introduction to conformal prediction, a framework that sits on top of traditional machine learning algorithms and which outputs valid confidence estimates to predictions from QSAR models in the form of prediction intervals that are specific to each predicted object. For regression, a prediction interval consists of an upper and a lower bound. For classification, a prediction interval is a set that contains none, one, or many of the potential classes. The size of the prediction interval is affected by a user-specified confidence/significance level, and by the nonconformity of the predicted object; i.e., the strangeness as defined by a nonconformity function. Conformal prediction provides a rigorous and mathematically proven framework for in silico modeling with guarantees on error rates as well as a consistent handling of the models’ applicability domain intrinsically linked to the underlying machine learning model. Apart from introducing the concepts and types of conformal prediction, we also provide an example application for modeling ABC transporters using conformal prediction, as well as a discussion on general implications for drug discovery.
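For the classification case described above, the prediction set can be computed with a split (inductive) conformal procedure. The sketch below uses a toy scikit-learn classifier as a stand-in for a QSAR model; the nonconformity function chosen here (one minus the predicted probability of the true class) is one common choice, not the only one:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy stand-in for a QSAR classifier: descriptors X, activity classes y.
X, y = make_classification(n_samples=900, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=300, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Nonconformity ("strangeness"): one minus the probability of the true class.
p_cal = clf.predict_proba(X_cal)
scores = 1.0 - p_cal[np.arange(len(y_cal)), y_cal]

eps = 0.1                                   # user-chosen significance level
n = len(scores)
qhat = np.sort(scores)[int(np.ceil((n + 1) * (1 - eps))) - 1]

# Prediction set: every class whose probability still "conforms".
# (Calibration features are reused here purely as demo inputs.)
p_new = clf.predict_proba(X_cal)
sets = [np.where(1.0 - p <= qhat)[0] for p in p_new]
```

As the abstract notes, such a set may contain none, one, or several of the potential classes: easy objects get singleton sets, while strange objects (outside the model's applicability domain) get large or empty sets.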
- Research Article
- 10.1007/s10994-024-06722-9
- Jun 9, 2025
- Machine Learning
Time-series forecasts underpin decision-making processes in a wide range of application domains. Recently it has been shown that these processes can be strengthened by conformal prediction, a framework that allows adding prediction intervals to point forecasts. The prediction intervals quantify the uncertainty of a predictive model with mathematical coverage guarantees, giving the user a range of scenarios to consider. However, applying conformal prediction to time-series tasks is not trivial. This is either because the exchangeability condition the framework places on the data is violated, or because the framework only allows for one-step-ahead univariate forecasts. In this article we combine two existing methods derived from conformal prediction, one built for multi-target regression and one designed to handle non-exchangeable data. The resulting method, called non-exchangeable multi-target conformal prediction (nmtCP), produces provably robust prediction regions for multi-step-ahead multidimensional time-series forecasts, meaning that the miscoverage rate is bounded. Additionally, nmtCP is computationally efficient and easy to implement. Due to its model-agnostic nature, nmtCP can be used on top of any time-series model that produces point forecasts. A theoretical analysis proves the method’s robustness while experiments on real-world data sets give insights into its practical behavior and performance.
- Research Article
- 10.3390/rs16030438
- Jan 23, 2024
- Remote Sensing
Soil organic carbon (SOC) contents and stocks provide valuable insights into soil health, nutrient cycling, greenhouse gas emissions, and overall ecosystem productivity. Given this, remote sensing data coupled with advanced machine learning (ML) techniques have eased SOC level estimation while revealing its patterns across different ecosystems. Despite these advances, however, training SOC models that are both reliable and adequately certain for specific end-users remains a great challenge. To address this, we need robust SOC uncertainty quantification techniques. Here, we introduce a methodology that leverages conformal prediction to address the uncertainty in estimating SOC contents from remote sensing data. Conformal prediction generates statistically reliable uncertainty intervals for predictions made by ML models. Our analysis, performed on the LUCAS dataset in Europe and incorporating a suite of relevant environmental covariates, underscores the efficacy of integrating conformal prediction with an ML model, specifically random forest. In addition, we conducted a comparative assessment of our results against prevalent uncertainty quantification methods for SOC prediction, employing different evaluation metrics to assess both model uncertainty and accuracy. Our methodology showcases the utility of the generated prediction sets as informative indicators of uncertainty. These sets accurately identify samples that pose prediction challenges, providing valuable insights for end-users seeking reliable predictions in the complexities of SOC estimation.
- Research Article
- 10.1017/pan.2025.10010
- Jul 30, 2025
- Political Analysis
Forecasting of armed conflicts is a critical area of research with the potential to save lives and mitigate suffering. While existing forecasting models offer valuable point predictions, they often lack individual-level uncertainty estimates, limiting their usefulness for decision-making. Several approaches exist to estimate uncertainty, such as parametric and Bayesian prediction intervals, bootstrapping, and quantile regression, but these methods often rely on restrictive assumptions, struggle to provide well-calibrated intervals across the full range of outcomes, or are computationally intensive. Conformal prediction offers a model-agnostic alternative that guarantees a user-specified level of coverage but typically provides only marginal coverage, potentially resulting in non-uniform coverage across different regions of the outcome space. In this article, we introduce a novel extension called bin-conditional conformal prediction (BCCP), which enhances standard conformal prediction (SCP) by ensuring consistent coverage rates across user-defined subsets (bins) of the outcome variable. We apply BCCP to simulated data as well as the forecasting of fatalities from armed conflicts, and demonstrate that it provides well-calibrated uncertainty estimates across various ranges of the outcome. Compared to SCP, BCCP offers improved local coverage, though this comes at the cost of slightly wider prediction intervals.
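The core of the bin-conditional idea (a separate conformal quantile per outcome bin) can be sketched as below. One loudly flagged simplification: here bins are assigned by the model's point prediction, Mondrian-style, so the same rule applies at calibration and test time; the article's BCCP construction over the true outcome variable is more involved, and all names here are illustrative.

```python
import numpy as np

def bin_conditional_intervals(yhat_cal, y_cal, yhat_test, bin_edges, eps=0.1):
    # A separate conformal quantile of |residual| is computed per bin, so
    # coverage is calibrated within each region of the outcome space rather
    # than only marginally.
    resid = np.abs(y_cal - yhat_cal)
    bins_cal = np.digitize(yhat_cal, bin_edges)
    bins_test = np.digitize(yhat_test, bin_edges)
    lo, hi = np.empty_like(yhat_test), np.empty_like(yhat_test)
    for b in np.unique(bins_test):
        r = np.sort(resid[bins_cal == b])
        n = len(r)
        if n:
            q = r[min(n - 1, int(np.ceil((n + 1) * (1 - eps))) - 1)]
        else:  # bin unseen at calibration: fall back to the global quantile
            q = np.quantile(resid, 1 - eps)
        mask = bins_test == b
        lo[mask], hi[mask] = yhat_test[mask] - q, yhat_test[mask] + q
    return lo, hi

rng = np.random.default_rng(2)
yhat = rng.uniform(0.0, 10.0, 1000)
y = yhat + rng.normal(scale=0.2 + 0.3 * yhat)   # error grows with the outcome
edges = np.array([2.5, 5.0, 7.5])
lo, hi = bin_conditional_intervals(yhat[:500], y[:500], yhat[500:], edges)
```

In this simulated setting, intervals in the high-outcome bins come out wider than in the low-outcome bins, illustrating the local calibration (and the slightly wider intervals) that the abstract describes.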
- Research Article
- 10.1007/s11356-024-35764-8
- Jan 1, 2025
- Environmental Science and Pollution Research
Human-induced global warming, primarily attributed to the rise in atmospheric CO2, poses a substantial risk to the survival of humanity. While most research focuses on predicting annual CO2 emissions, which are crucial for setting long-term emission mitigation targets, the precise prediction of daily CO2 emissions is equally vital for setting short-term targets. This study examines the performance of 14 models in predicting daily CO2 emissions data from 1/1/2022 to 30/9/2023 across the top four polluting regions (China, India, the USA, and the EU27&UK). The 14 models used in the study include four statistical models (ARMA, ARIMA, SARMA, and SARIMA), three machine learning models (support vector machine (SVM), random forest (RF), and gradient boosting (GB)), and seven deep learning models (artificial neural network (ANN), recurrent neural network variations such as gated recurrent unit (GRU), long short-term memory (LSTM), bidirectional-LSTM (BILSTM), and three hybrid combinations of CNN-RNN). Performance evaluation employs four metrics (R2, MAE, RMSE, and MAPE). The results show that the machine learning (ML) and deep learning (DL) models, with higher R2 (0.714–0.932) and lower RMSE (0.480–0.247) values, respectively, outperformed the statistical models, which had R2 (− 0.060–0.719) and RMSE (1.695–0.537) values, in predicting daily CO2 emissions across all four regions. The performance of the ML and DL models was further enhanced by differencing, a technique that improves accuracy by ensuring stationarity and creating additional features and patterns from which the model can learn. Additionally, applying ensemble techniques such as bagging and voting improved the performance of the ML models by approximately 9.6%, whereas hybrid combinations of CNN-RNN enhanced the performance of the RNN models. In summary, the performance of both the ML and DL models was relatively similar.
However, due to the high computational requirements associated with DL models, the recommended models for daily CO2 emission prediction are ML models using the ensemble technique of voting and bagging. This model can assist in accurately forecasting daily emissions, aiding authorities in setting targets for CO2 emission reduction.
- Research Article
- 10.1016/j.engappai.2024.109724
- Nov 29, 2024
- Engineering Applications of Artificial Intelligence
Towards robust ferrous scrap material classification with deep learning and conformal prediction
- Research Article
- 10.1111/exsy.13153
- Oct 5, 2022
- Expert Systems
The increase in the number of undesired SMS messages, termed smishing messages, together with the data imbalance problem, has generated a great demand for the development of more reliable anti‐spam filters. State‐of‐the‐art machine learning approaches are being employed to recognize and separate spam messages. Most recent studies target message classification by using numerous properties and features of the words but fail to consider contextual features such as long‐range dependencies between the words, which are extremely important in identifying smishing messages. The idea is to develop an intelligent model that will distinguish between smishing messages and ham messages, by adopting a combined approach of regular expressions (Regex), machine learning (ML) and deep learning (DL) models. Regex rules are generated using the dataset's spam messages for the purpose of refining the dataset. Support vector machine (SVM), Multinomial Naive Bayes and Random Forest are included under machine learning models, and long short‐term memory (LSTM), bidirectional long short‐term memory (Bi‐LSTM), stacked LSTM and stacked Bi‐LSTM are included under deep learning models. The comparison between machine learning models and deep learning models is also carried out based on the performance evaluation parameters, namely accuracy, precision, recall and F1 score. It is observed that deep learning models perform better than machine learning models, and the introduction of regular expressions to the dataset increases the efficiency of both the deep learning and machine learning models.
- Research Article
- 10.1016/j.neucom.2021.12.075
- Jan 1, 2022
- Neurocomputing
DBC-Forest: Deep forest with binning confidence screening