From point to probabilistic gradient boosting for claim frequency and severity prediction
Gradient-boosted decision tree algorithms are increasingly used in actuarial applications because they show superior predictive performance compared with traditional generalised linear models. Many enhancements of the original gradient boosting machine algorithm exist. We present, in a unified notation, and contrast the existing point and probabilistic gradient-boosted decision tree algorithms: GBM, XGBoost, DART, LightGBM, CatBoost, EGBM, PGBM, XGBoostLSS, cyclic GBM, and NGBoost. In a comprehensive numerical study, we compare their performance on five publicly available claim frequency and severity datasets of various sizes, comprising different numbers of (high-cardinality) categorical variables. We explain how varying exposure-to-risk can be handled with boosting in frequency models. We compare the algorithms on the basis of computational efficiency, predictive performance, and model adequacy. LightGBM and XGBoostLSS lead in terms of computational efficiency. CatBoost sometimes improves predictive performance, especially in the presence of high-cardinality categorical variables, which are common in actuarial science. The fully interpretable EGBM achieves competitive predictive performance compared with the black-box algorithms considered. We find that there is no trade-off between model adequacy and predictive accuracy: both are achievable simultaneously.
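The abstract mentions handling varying exposure-to-risk in boosted frequency models. A minimal sketch of the standard offset approach is given below; it is an illustration under assumptions, not the paper's own implementation. The column names (claim counts and exposure) and hyperparameter values are hypothetical, and XGBoost is used only as one concrete backend: with a Poisson objective, log-exposure enters as a base margin so the trees learn the claim frequency per unit of exposure.

```python
# Sketch: claim-frequency boosting with exposure-to-risk as an offset.
# Assumed inputs: a numeric feature matrix X, observed claim counts, and
# policy exposures (all names illustrative).
import numpy as np
import xgboost as xgb


def fit_frequency_model(X_train, claims_train, exposure_train, num_rounds=200):
    """Fit a Poisson-boosted frequency model with log(exposure) as an offset."""
    dtrain = xgb.DMatrix(
        X_train,
        label=claims_train,                  # observed claim counts per policy
        base_margin=np.log(exposure_train),  # offset on the log (link) scale
    )
    params = {
        "objective": "count:poisson",  # Poisson deviance loss for claim counts
        "eta": 0.05,                   # learning rate (illustrative value)
        "max_depth": 3,                # shallow trees, as is common for tabular insurance data
    }
    return xgb.train(params, dtrain, num_boost_round=num_rounds)


def predict_claim_counts(model, X_new, exposure_new):
    """Predict expected claim counts for new policies with their own exposures."""
    dnew = xgb.DMatrix(X_new, base_margin=np.log(exposure_new))
    # Prediction = exposure * estimated frequency, since the offset is added
    # on the log scale before exponentiation.
    return model.predict(dnew)
```

Because the offset is applied on the link scale, the booster models log-frequency and predictions scale linearly with exposure; LightGBM's `init_score` argument plays the same role if that library is preferred.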