Advancing perioperative optimization in Crohn's disease surgery with machine learning predictions.
This editorial offers commentary on the article which aimed to forecast the likelihood of short-term major postoperative complications (Clavien-Dindo grade ≥ III), including anastomotic fistula, intra-abdominal sepsis, bleeding, and intestinal obstruction within 30 days, as well as prolonged hospital stays following ileocecal resection in patients with Crohn's disease (CD). This prediction relied on a machine learning (ML) model trained on a cohort that integrated a nomogram predictive model derived from logistic regression analysis and a random forest (RF) model. Both the nomogram and RF showed good performance, with the RF model demonstrating superior predictive ability. Key variables identified as potentially critical include a preoperative CD activity index ≥ 220, low preoperative serum albumin levels, and prolonged operation duration. Applying ML approaches to predict surgical recurrence has the potential to enhance patient risk stratification and facilitate the development of preoperative optimization strategies, ultimately aiming to improve post-surgical outcomes. However, there is still room for improvement, particularly through the inclusion of additional relevant clinical parameters, consideration of medical therapies, and potential integration of molecular biomarkers in future research efforts.
- Research Article
182
- 10.3390/fire2030043
- Jul 28, 2019
- Fire
Recently, global climate change discussions have become more prominent, and forests are considered the ecosystems most at risk from the consequences of climate change. Wildfires are among the main drivers of losses in forested areas. The increasing availability of free remotely sensed data has enabled the precise locations of wildfires to be reliably monitored. A wildfire data inventory was created by integrating global positioning system (GPS) polygons with data collected from the moderate resolution imaging spectroradiometer (MODIS) thermal anomalies product between 2012 and 2017 for Amol County, northern Iran. The GPS polygon dataset from the state wildlife organization was gathered through extensive field surveys. The integrated inventory dataset, along with sixteen conditioning factors (topographic, meteorological, vegetation, anthropological, and hydrological factors), was used to evaluate the potential of different machine learning (ML) approaches for the spatial prediction of wildfire susceptibility. The applied ML approaches included an artificial neural network (ANN), support vector machines (SVM), and random forest (RF). All ML approaches were trained using 75% of the wildfire inventory dataset and tested using the remaining 25% of the dataset in a four-fold cross-validation (CV) procedure. The CV method is used to deal with the effects of random training and testing dataset selection on the performance of the applied ML approaches. To validate the resulting wildfire susceptibility maps based on three different ML approaches and four different folds of inventory datasets, the true positive and false positive rates were calculated. The accuracy of each of the twelve resulting maps was then assessed through the receiver operating characteristics (ROC) curve. The resulting CV accuracies were 74%, 79% and 88% for the ANN, SVM and RF, respectively.
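The four-fold procedure described above (75% training, 25% testing, rotated four times) and the ROC-based scoring can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `k_fold_indices` and `roc_auc` are hypothetical, the shuffling seed is an assumption, and the AUC is computed by the pairwise-ranking definition rather than by any particular library.

```python
import random


def k_fold_indices(n_samples, k=4, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # assumed: one random shuffle up front
    folds = [idx[i::k] for i in range(k)]     # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

With three models and four folds, looping a model over `k_fold_indices` and scoring each test fold with `roc_auc` would give the twelve evaluations the abstract mentions.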
- Research Article
7
- 10.2196/38040
- Nov 2, 2022
- JMIR Cardio
Background: Many machine learning approaches are limited to classification of outcomes rather than longitudinal prediction. One strategy to use machine learning in clinical risk prediction is to classify outcomes over a given time horizon. However, it is not well-known how to identify the optimal time horizon for risk prediction. Objective: In this study, we aim to identify an optimal time horizon for classification of incident myocardial infarction (MI) using machine learning approaches looped over outcomes with increasing time horizons. Additionally, we sought to compare the performance of these models with the traditional Framingham Heart Study (FHS) coronary heart disease gender-specific Cox proportional hazards regression model. Methods: We analyzed data from a single clinic visit of 5201 participants of a cardiovascular health study. We examined 61 variables collected from this baseline exam, including demographic and biologic data, medical history, medications, serum biomarkers, electrocardiographic, and echocardiographic data. We compared several machine learning methods (eg, random forest, L1 regression, gradient boosted decision tree, support vector machine, and k-nearest neighbor) trained to predict incident MI that occurred within time horizons ranging from 500-10,000 days of follow-up. Models were compared on a 20% held-out testing set using area under the receiver operating characteristic curve (AUROC). Variable importance was performed for random forest and L1 regression models across time points. We compared results with the FHS coronary heart disease gender-specific Cox proportional hazards regression functions. Results: There were 4190 participants included in the analysis, with 2522 (60.2%) female participants and an average age of 72.6 years. Over 10,000 days of follow-up, there were 813 incident MI events. The machine learning models were most predictive over moderate follow-up time horizons (ie, 1500-2500 days). 
Overall, the L1 (Lasso) logistic regression demonstrated the strongest classification accuracy across all time horizons. This model was most predictive at 1500 days follow-up, with an AUROC of 0.71. The most influential variables differed by follow-up time and model, with gender being the most important feature for the L1 regression and weight for the random forest model across all time frames. Compared with the Framingham Cox function, the L1 and random forest models performed better across all time frames beyond 1500 days. Conclusions: In a population free of coronary heart disease, machine learning techniques can be used to predict incident MI at varying time horizons with reasonable accuracy, with the strongest prediction accuracy in moderate follow-up periods. Validation across additional populations is needed to confirm the validity of this approach in risk prediction.
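The idea of looping a classifier over outcomes with increasing time horizons can be illustrated by how the binary labels might be built for each horizon. The sketch below is an assumption about the labeling step only: the names `labels_for_horizon` and `event_counts_by_horizon` are hypothetical, participants without a recorded MI are simply labeled 0, and censoring is ignored, which the study may have handled differently.

```python
def labels_for_horizon(days_to_mi, horizon_days):
    """Return 1 where an MI was recorded on or before the horizon, else 0.

    days_to_mi holds the follow-up day of the first MI, or None when no MI
    was recorded (assumed encoding for this sketch).
    """
    return [1 if d is not None and d <= horizon_days else 0 for d in days_to_mi]


def event_counts_by_horizon(days_to_mi, horizons):
    """Count positive labels per horizon; a classifier would be refit for each."""
    return {h: sum(labels_for_horizon(days_to_mi, h)) for h in horizons}


# Horizons spanning the 500-10,000 day range mentioned in the abstract;
# the 500-day step is an illustrative choice.
HORIZONS = range(500, 10001, 500)
```

Short horizons yield few positive labels and long horizons mix early and late events, which is one plausible reason performance could peak at moderate horizons.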
- Research Article
34
- 10.1097/tp.0000000000002923
- May 1, 2020
- Transplantation
Risk prediction plays an important role in clinical transplantation research. Traditionally, most risk models have been based on regression models.1 Although useful to help understand relationships between predictors and outcomes, these statistical methods can typically evaluate only a small number of predictors, which are assumed to affect everyone in the same way, and uniformly throughout the participants' lifespan. These methods have several limitations,2 including the inability to analyze nonlinear relationships, the requirement of setting a level of binary significance, impracticality for analyzing large datasets, and vulnerability to bias secondary to variable selection and/or omission of relevant confounders. With the emergence of P4 (Predictive, Preventive, Personalized, and Participatory) and Precision Medicine, artificial intelligence and machine learning methods have come to attention as methods aimed at solving the challenges in analysis not well addressed by regression approaches. Machine learning methods provide algorithms to understand patterns from large, complex, and heterogeneous data.3 Of the machine learning methods, recursive partitioning, and especially random forests, can deal with large numbers of predictor variables even in the presence of complex interactions.2,4 These methods have been applied successfully in genetics, clinical research, and bioinformatics. In this issue of Transplantation, Scheffner et al report on the development and internal validation of a random forest prediction model for patient survival.5 Random forest models are composed of a collection of decision trees. In the process of building each decision tree, different random subsets of the variables from the training dataset are selected to establish how best to partition the dataset at each node.6 Random forest models are considered less vulnerable to overfitting the training dataset given the large number of trees built, making each tree an independent model. 
The lower likelihood of bias is a result of bootstrapping several trees over randomly selected subsets of variables and subsamples of data.6 Random forest models require little preprocessing of data; the data need not be normalized; and the approach is resilient to outliers. While missing data will be a challenge when trying to draw clinical inferences from standard statistical models, machine learning methods tend to make fewer assumptions about the underlying data and, thus, are less vulnerable to the challenges associated with violation of those assumptions. Relying on fewer assumptions than regression analysis, machine learning methods have been shown to deliver more robust predictions. Scheffner and colleagues5 split a retrospective cohort of kidney transplant recipients with posttransplantation protocol biopsies into training and validation datasets (Figure 2A and B). Using all pretransplant and 3- and 12-months posttransplant variables, the obtained models showed good performance to predict death (concordance index: 0.77–0.78). Validation showed a concordance index of 0.76 and good discrimination of risks by the models, despite substantial differences in clinical variables and the derivation dataset representing an earlier era (2000–2007) than the validation dataset (2008–2013). 
To contrast with outputs of multivariable regression models using the same datasets, see Tables 2 and 3 and nomograms predicting mortality risk using estimators from multivariable Cox models (Figure 3) in Abeling et al.7 Random survival forests also inform on the importance of descriptive variables.6 Scheffner found the potentially modifiable (and highly correlated) graft rejection treatment and urinary tract infection to be important predictors of patient survival in addition to established factors like age, cardiovascular disease, diabetes, and graft function (Figure 3A and B).5 Many of the predictors retained in multivariable regression models7 were also deemed important in random forest survival analyses.5 To validate selected predictors and model construction, it is important to pursue external validation with independent datasets. Random survival forests may complement regression analyses when handling highly correlated complex survival data. Opportunities for application (and limitations) of each of the regression and random survival forests for prediction are summarized in Table 1 (Regression and random survival forests for survival analysis). Predictive models in transplantation and donation help risk stratify patients and could improve quality of healthcare delivery as well as patient outcomes. The increasing interest in these tools warrants a better understanding of their challenges and limitations.8 First, highly predictive variables may not necessarily be causally related to the outcomes of interest. Second, the success of machine learning models depends on the relationship between predictors and outcome being represented in training/validation datasets, the number of observations and features, selection and parameterization of features, and the algorithm chosen for the model. Careful variable definition (eg, urinary tract infection) is necessary. 
Presence of highly correlated linear and nonlinear relationships between independent variables may warrant mechanisms for removal of the correlated variables. Model performance may also be compromised when studying rare outcomes.4 Inevitably, generalizability of machine learning models may be limited when the clinical context, local factors (including patient/physician preferences, health systems, and care standards), and therapeutic strategies vary. To enable assessment of model validity, correct interpretation of model outputs, replication, and future knowledge synthesis, it is vital that the transplantation and donation community promote adherence to guidelines on the dissemination and reporting of machine learning models.8,9 Authors should be encouraged to report all model parameters, transformations applied to raw data, sampling methods, and random number generator seeds. Whenever possible, algorithms and associated code should be released in public software archive domains. There is a need for new models of health data ownership with rights to the individual, highly secure data repositories, government legislation for data sharing, and usage policies to ensure privacy and data security. Moreover, with wide uptake of machine learning and artificial intelligence tools, the scale of iatrogenic risks and liabilities related to their application, in contrast to the implications of a single doctor's mistake for a given patient, also warrant assessment.10 Most practice guidelines are geared toward the "average patient." Machine learning tools can capture the complexity of individual patients' characteristics and aid transplant clinicians with patient-specific care decisions. As these tools become more prevalent, it is important to develop best practice guidelines and ensure there is regulatory oversight on their development and application.
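The concordance index quoted in this commentary (0.76 to 0.78) measures how often a model ranks patients' risks in the same order as their observed survival. A minimal sketch of Harrell's version is given below; the function name `concordance_index` is hypothetical, ties in risk are given half credit by assumption, and this is not the implementation used by Scheffner et al.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs where higher risk died earlier.

    times       -- follow-up time per patient
    events      -- 1 if death was observed, 0 if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # assumed convention for tied scores
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so values near 0.77 indicate useful but imperfect discrimination.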
- Research Article
59
- 10.1093/icvts/ivv247
- Sep 10, 2015
- Interactive CardioVascular and Thoracic Surgery
A best evidence topic was written according to a structured protocol. The clinical question investigated was: is low serum albumin associated with postoperative complications in patients undergoing cardiac surgery? There were 62 papers retrieved using the reported search strategy. Of these, 12 publications embodied the best evidence to answer this clinical question. The authors, journal, date and country of the publication, patient group investigated, study design, relevant outcomes and results of these papers were tabulated. This paper includes a total of 12 589 patients, and of the papers reviewed, 4 were level 3 and 8 level 4. Each of the publications reviewed and compared either all or some of the following postoperative complications: mortality, postoperative bleeding requiring reoperation, prolonged hospital stay and ventilatory support, infection, liver dysfunction, delirium and acute kidney injury (AKI). Of the studies that examined postoperative mortality, all except for three established a significant multivariate association with low preoperative albumin level. Some scepticism is required in accepting other results that were only present in univariate analysis. While three studies examined multiple levels of serum albumin, most dichotomized the serum albumin levels into normal and abnormal groups. This led to differing classifications of hypoalbuminaemia, ranging from less than 2.5 to 4.0 g/dl. The available evidence, however, suggests that low preoperative serum albumin level in patients undergoing cardiac surgery is associated with the following: (i) increased risk of mortality after surgery and (ii) greater incidence of postoperative morbidity. While the evidence supports the use of preoperative albumin in assessing post-cardiac surgery complications, a specific level of albumin considered to be abnormal cannot be concluded from this review.
- Research Article
20
- 10.5664/jcsm.9630
- Aug 26, 2021
- Journal of clinical sleep medicine : JCSM : official publication of the American Academy of Sleep Medicine
Obstructive sleep apnea (OSA) is considered to be an important risk factor for the development of cardiovascular disease (CVD). This study aimed to develop and evaluate a machine learning approach with a set of features for assessing the 10-year CVD mortality risk of the OSA population. This study included 2,464 patients with OSA who met study inclusion criteria and were selected from the Sleep Heart Health Study. We evaluated the importance of potential features by mutual information. The top 9 features were selected to develop a random forest model. We evaluated the model performance on a test set (n = 493) using the area under the receiver operating curve with 95% confidence interval and confusion matrix. The random forest model achieved the highest area under the receiver operating curve of 0.84 (95% confidence interval: 0.78-0.89). The specificity and sensitivity were 73.94% and 81.82%, respectively. Sixty-three years of age was a threshold for increased risk of 10-year CVD mortality. Persons with severe OSA had higher risk than those with mild OSA. This study demonstrated that a random forest model can provide a quick assessment of the risk of 10-year CVD mortality. Our model may be more informative for patients with OSA in determining their future CVD mortality risk. Li A, Roveda JM, Powers LS, Quan SF. Obstructive sleep apnea predicts 10-year cardiovascular disease-related mortality in the Sleep Heart Health Study: a machine learning approach. J Clin Sleep Med. 2022;18(2):497-504.
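Ranking features by mutual information and keeping the top few, as described above, can be sketched for discrete features as follows. The function names `mutual_information` and `top_k_features` are hypothetical, the estimate uses simple empirical frequencies, and continuous features would first need binning or a different estimator; none of this is taken from the study's code.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def top_k_features(columns, target, k):
    """Return the k feature names with the highest mutual information with target."""
    scores = {name: mutual_information(vals, target) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

In the study's setting `k` would be 9, and the selected columns would then be passed to the random forest.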
- Research Article
54
- 10.1142/s0192415x22500045
- Dec 2, 2021
- The American Journal of Chinese Medicine
Machine learning (ML), a branch of artificial intelligence, extracts potentially meaningful rules from large volumes of data using diverse algorithms. Because research on traditional Chinese medicine (TCM) involves the digitalization of clinical records or experimental work, massive and complex datasets have become an inextricable part of related studies. It is thus not surprising that ML approaches, as novel and efficient tools for mining useful knowledge from data, have made inroads into a variety of areas of TCM over the past decade. However, a survey of the literature shows that not all ML approaches perform well in the same field. On further consideration, we infer that a specificity may exist between ML approaches and their applied fields. This systematic review focuses on four categories of ML approaches and their eight application scopes in TCM. According to function, ML approaches are classified into four categories (classification, regression, clustering, and dimensionality reduction) and, in more detail, into 14 models: support vector machine, least square-support vector machine, logistic regression, partial least squares regression, k-means clustering, hierarchical cluster analysis, artificial neural network, back propagation neural network, convolutional neural network, decision tree, random forest, principal component analysis, partial least squares-discriminant analysis, and orthogonal partial least squares-discriminant analysis. The eight common applied fields are divided into two parts: one for TCM, such as the diagnosis of diseases, the determination of syndromes, and the analysis of prescriptions, and the other for related research on Chinese herbal medicine, such as quality control, the identification of geographic origins, the pharmacodynamic material basis, medicinal properties, and pharmacokinetics and pharmacodynamics. 
Additionally, this paper discusses the differences in function and features among ML approaches when they are applied to the corresponding fields by comparing their principles. The specificity of each approach to its applied fields has also been affirmed, thereby laying a foundation for subsequent studies applying ML approaches to TCM.
- Research Article
13
- 10.1016/j.ajog.2013.08.037
- Aug 30, 2013
- American Journal of Obstetrics and Gynecology
Surgical site infections after hysterectomy among HIV-infected women in the HAART era: a single institution's experience from 1999–2012
- Research Article
1
- 10.46717/igj.57.2e.2ms-2024-11-11
- Nov 29, 2024
- The Iraqi Geological Journal
Facies from wells drilled in the study area were interpreted manually from cores taken at every 10 meters of depth during drilling. Cores at this spacing do not give the true facies of all wells because a 10-meter interval is very coarse. Extracting cores is also expensive and time-consuming. The machine learning methodology consists of four steps (data gathering, data preprocessing, model training, and model evaluation). This work applies two supervised machine learning techniques: random forest and decision tree models. (1) Data gathering: the dataset was collected from the Basrah Oil Company. It contains ten wells (B-3, B-4, B-5, B-15, B-17, B-18, B-19, B-34, B-39, and B-40). Every well contains six features (logs): sonic log, deep resistivity, micro spherically focused log, neutron porosity, density log, and gamma ray. The dataset also contains ten facies labels: mudstone, wackestone, packstone, roundstone, floatstone, shale, mud and wackestone, wacke and packstone, pack and grainstone, and pack and floatstone. These logs cover the full thickness of the Mishrif Formation, which is the target of this study. (2) Preprocessing: the data must be cleaned of outlier values, which reduce the accuracy of the model during training. At this stage it is also necessary to understand the relationships between all the features, because the more highly correlated any two features are, the less useful they are for machine learning. Well B-5 was held out (blinded) from training to demonstrate the ability of the machine learning models to predict lithofacies; the remaining dataset was then split randomly into 70% for training, 10% for validation, and 20% for testing to verify performance. (3) Training: two machine learning models, decision tree and random forest, were trained. 
(4) Evaluation: four statistics were computed for the two models from the confusion matrix (classification accuracy, recall, precision, and F1-score), showing that the random forest was more accurate than the decision tree because the random forest model handles a dataset of this size very well. The receiver operating characteristic curves of the random forest model had a larger area under the curve than those of the decision tree and lay above the main diagonal for all lithotypes, and the values for all classes exceeded 95%, except class 6 (shale), because of the 100% accuracy of classification. Facies classification by a machine learning approach has two benefits: (1) increased accuracy in describing oil reservoirs and (2) reduced time needed by a geologist to interpret log data.
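The data partitioning described in step (2), one blind well plus a random 70/10/20 split of the remaining samples, might be organized as in the sketch below. The function name `split_with_blind_well`, the `"well"` key on each row, and the seed are assumptions made for illustration; the abstract does not specify how the split was implemented.

```python
import random


def split_with_blind_well(rows, blind_well, seed=0):
    """Hold out one well entirely, then split the rest 70/10/20 at random.

    rows is a list of dicts, each with a "well" key (assumed layout).
    """
    blind = [r for r in rows if r["well"] == blind_well]
    rest = [r for r in rows if r["well"] != blind_well]
    random.Random(seed).shuffle(rest)
    n_train = len(rest) * 7 // 10   # 70% for training
    n_val = len(rest) // 10         # 10% for validation
    return {
        "train": rest[:n_train],
        "val": rest[n_train:n_train + n_val],
        "test": rest[n_train + n_val:],  # remaining ~20% for testing
        "blind": blind,                   # never seen during training
    }
```

Keeping the blind well out of all three random partitions is what lets it serve as an independent check on lithofacies prediction.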
- Research Article
- 10.1093/ecco-jcc/jjac190.0207
- Jan 30, 2023
- Journal of Crohn's and Colitis
Background About half of patients with Crohn’s disease (CD) require surgery in the course of treatment; furthermore, reoperation for later recurrence is also frequent. One of the characteristic findings in CD patients is creeping fat (CF) wrapped around inflamed areas of the intestine, and CF correlates with stenosis of the intestinal tract resulting from changes in the connective tissue of the intestinal wall. Lower postoperative recurrence has been reported among CD patients who underwent extensive CF resection, suggesting that CF is clinically important. However, the detailed mechanisms are still unknown. The aim of this study is to examine the functional mechanism and clinical significance of CF, focusing on innate lymphoid cells (ILCs) and macrophages. Methods Normal mesentery (NM) of the ileum was obtained from microscopically and macroscopically intact areas in patients with colorectal cancer. Mesentery was also obtained from surgically resected specimens from patients with CD who were diagnosed based on established clinical, radiologic, and endoscopic criteria. The mesenteric stromal vascular fraction (SVF) was isolated by enzymatic digestion. Mesentery from CD patients was divided into CF and areas with no or slight inflammation (non-CF). Early recurrence was defined as a Rutgeerts score (≥2) or worsening of clinical symptoms (CDAI≥220). Results RNA-seq of CF revealed that expression of immune-related genes and of fibrosis-related genes was elevated compared with NM and non-CF. Both the frequency and number of ILC1 were increased, and those of ILC2 and ILC3 were decreased, in CF compared with NM. Positive associations between ILC1 and inflammatory macrophages (CD14+ CD163low) related to fat fibrosis were observed in frequency and number per unit weight. FACS and RT-PCR showed that ILC1 from CF had higher expression of IFNγ than those from non-CF. 
A co-culture experiment with ILC1 from CF and SVF from NM showed increased expression of fibrosis-associated genes (COL1A, COL3A), inflammatory markers of macrophages (iNOS, Mincle), and TGFB1, and these effects were cancelled by an IFNγ-neutralizing antibody. Evaluation of risk factors for early recurrence showed that a high frequency of ILC1 from CF and a low preoperative serum albumin level were significantly associated with early recurrence (P = 0.004 and P = 0.04, respectively). There was a significant difference between the ILC1 high (≥ 80 %) and ILC1 low (< 80 %) groups from CF in terms of early recurrence-free survival (P = 0.02). Conclusion In CF, ILC1 plays a role in macrophage function related to fat fibrosis via IFNγ production. CD patients with high ILC1 (≥80%) in the CF tend toward early recurrence.
- Research Article
132
- 10.1016/s0002-9610(88)80256-3
- Jan 1, 1988
- The American Journal of Surgery
Relationship of postoperative septic complications and blood transfusions in patients with Crohn's disease
- Research Article
4
- 10.1016/j.rcsop.2023.100307
- Jul 10, 2023
- Exploratory Research in Clinical and Social Pharmacy
Assessing treatment switch among patients with multiple sclerosis: A machine learning approach
- Research Article
- 10.52436/1.jutif.2025.6.4.5204
- Sep 2, 2025
- Jurnal Teknik Informatika (Jutif)
The geographically weighted regression (GWR) model has been widely used in various types of predictions, including human development index predictions. Similarly, the random forests (RF) model has also been widely used in various value predictions. The GWR model always assumes a local linear relationship between dependent and independent variables. The RF model only produces one global model that cannot represent conditions at each location. The GWR model is susceptible to multicollinearity in each independent variable, which can lead to overfitting if multicollinearity in the model is high. To address the vulnerability of the GWR model to multicollinearity, the RF model and the GWR model can be combined. Since the RF model is not vulnerable to multicollinearity in the independent variables, the modification becomes the geographically weighted random forests (GWRF) model to improve the shortcomings of the GWR and RF models. The GWR and GWRF models were constructed using data from districts and cities in Central Java Province, which was selected as the study area due to evident disparities in human development index achievements. These disparities highlight the presence of spatial heterogeneity that conventional models fail to adequately capture. To rigorously evaluate model performance, data from 2023 were employed as training data, while data from 2024 served as testing data. This research introduces a novel integration of spatial econometric and machine learning approaches, providing a more robust framework for addressing complex spatial variations in human development outcomes. The GWRF model is capable of producing a model that does not overfit when there is multicollinearity among independent variables. 
The GWRF model offers a novel integration of machine learning and spatial modelling, outperforming both GWR and RF by not only delivering high predictive accuracy under complex variable relationships but also capturing nuanced local spatial heterogeneity that conventional approaches fail to address.
- Research Article
2
- 10.33330/jurteksi.v10i2.2993
- Mar 29, 2024
- JURTEKSI (Jurnal Teknologi dan Sistem Informasi)
Abstract: Gross Regional Domestic Product (GRDP) is one of the most important socio-economic indicators. To gain a more comprehensive understanding of the current economic situation and regional differences, estimating GRDP by integrating satellite imagery and official statistics can provide valuable information. This research estimates the GRDP value in 2022 using data from 2019 to 2021 related to two aspects, agriculture and non-agriculture. Soil adjusted vegetation index (SAVI), enhanced vegetation index (EVI), and land cover (LC) were used for the agriculture aspect, while nighttime light (NTL), human settlement index (HSI), land area, and population per regency/city were used for the non-agriculture aspect. GRDP estimates were produced with a machine learning approach using support vector machine (SVM) and random forest (RF) methods. A correlation test on each variable shows that only land area does not have a significant correlation with GRDP. The RF model was then chosen as the best model, with RMSE, MSE, MAE, and R2 values of 0.2549; 0.5049; 0.7727; and 0.2543, respectively. The estimated values obtained in several regencies/cities are fairly close, and some very close, to the official statistics values. Keywords: GRDP; satellite imagery; machine learning; random forest; support vector machine
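The four evaluation measures named in this abstract (RMSE, MSE, MAE, and R2) are related by simple formulas, which the following sketch computes from observed and predicted values. The function name `regression_metrics` is hypothetical and the definitions are the standard textbook ones, assumed here rather than taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for paired observed/predicted values."""
    n = len(y_true)
    errors = [p - y for y, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)                       # RMSE is always the square root of MSE
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot                  # share of variance explained
    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}
```

Because RMSE is the square root of MSE, the two values can be cross-checked against each other whenever both are reported.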
- Research Article
15
- 10.3390/jcm10215021
- Oct 28, 2021
- Journal of Clinical Medicine
Background: Lactic acidosis is the most common cause of anion gap metabolic acidosis in the intensive care unit (ICU), associated with poor outcomes including mortality. We sought to compare machine learning (ML) approaches versus logistic regression analysis for prediction of mortality in lactic acidosis patients admitted to the ICU. Methods: We used the Medical Information Mart for Intensive Care (MIMIC-III) database to identify ICU adult patients with lactic acidosis (serum lactate ≥4 mmol/L). The outcome of interest was hospital mortality. We developed prediction models using four ML approaches consisting of random forest (RF), decision tree (DT), extreme gradient boosting (XGBoost), artificial neural network (ANN), and statistical modeling with forward stepwise logistic regression using the training dataset. We then assessed model performance using area under the receiver operating characteristic curve (AUROC), accuracy, precision, error rate, Matthews correlation coefficient (MCC), F1 score, and assessed model calibration using the Brier score, in the independent testing dataset. Results: Of 1919 lactic acidosis ICU patients, 1535 and 384 were included in the training and testing dataset, respectively. Hospital mortality was 30%. RF had the highest AUROC at 0.83, followed by logistic regression 0.81, XGBoost 0.81, ANN 0.79, and DT 0.71. In addition, RF also had the highest accuracy (0.79), MCC (0.45), F1 score (0.56), and lowest error rate (21.4%). The RF model was the most well-calibrated. The Brier score for RF, DT, XGBoost, ANN, and multivariable logistic regression was 0.15, 0.19, 0.18, 0.19, and 0.16, respectively. The RF model outperformed the multivariable logistic regression model, SOFA score (AUROC 0.74), SAPS II score (AUROC 0.77), and Charlson score (AUROC 0.69). Conclusion: The ML prediction model using the RF algorithm provided the highest predictive performance for hospital mortality among ICU patients with lactic acidosis.
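Two of the less familiar measures used above, the Brier score for calibration and the Matthews correlation coefficient (MCC) for classification quality, can be computed as in this sketch. The function names are hypothetical, the formulas are the standard definitions, and returning 0.0 when the MCC denominator is zero is an assumed convention.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def matthews_corrcoef(y_true, y_pred):
    """MCC from binary labels and binary predictions; ranges from -1 to 1."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike accuracy, MCC stays informative when classes are imbalanced, which matters here given a 30% mortality rate.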
- Research Article
6
- 10.1016/j.ejim.2025.01.020
- Mar 1, 2025
- European journal of internal medicine
Prediction of mortality in heart failure by machine learning. Comparison with statistical modeling.