Prediction models developed using artificial intelligence: similar predictive performances with highly varying predictions for individuals – an illustration in deep vein thrombosis
Objectives: The rise in popularity and off-the-shelf availability of machine learning (ML) and AI-based methodology for developing new prediction models gives developers ample choice when comparing and selecting the best-performing model out of many candidates. Many studies have shown that, in such comparisons on any particular dataset, the difference in performance between models developed using different techniques (e.g. logistic regression vs. random forest or neural networks) is often small, especially when looking at crude performance measures such as the area under the ROC curve (AUC). This may lead to the conclusion that such models are essentially exchangeable and that model selection is arbitrary. However, as we will illustrate using a dataset on deep venous thrombosis, prediction models with similar discriminative performance may nonetheless generate different outcome probability estimates for individual patients and potentially lead to meaningfully different decision making.
Methods: We developed diagnostic prediction models to predict the presence of deep venous thrombosis (DVT) in a large dataset of patients with leg symptoms suspected of having DVT, using five modelling techniques: unpenalized logistic regression (ULR), ridge logistic regression (RLR), random forests (RF), support vector machine (SVM) and neural network (NN). Age, sex, d-dimer, history of DVT, diagnosis alternative to DVT, and having cancer were used as a fixed set of predictors. Model performance was evaluated in terms of discrimination, calibration, and stability of individual risk prediction for a set of patients across the models.
Results: Of the 6,087 suspected patients, 1,146 (19%) were diagnosed with DVT based on leg ultrasound (reference test). Three prediction models (ULR, RLR, NN) had similar discrimination, with AUC point estimates of 0.84. However, the estimated probabilities of DVT for the 6,087 individuals varied substantially across the five modelling techniques, highlighting differences in prediction stability. Notably, the RF model tended to overestimate individual risks, while the SVM model tended to underestimate them compared to the other models. While the estimated probabilities were more similar for ULR, RLR and NN, classification measures (sensitivity, specificity, positive and negative predictive value) did differ because of differences in the estimated probabilities of individuals near the risk threshold, illustrating that differences, even when relatively small, could potentially lead to different clinical decisions.
Conclusions: Prediction models developed with different modelling techniques yielded very different outcome probabilities for individuals, even though the models had similar discriminative performance in this low-dimensional setting. Part of this variation can be explained by differences in calibration, but part also stems from modelling choices, as estimated risks also differed between modelling techniques with similar calibration performance. Hence, our findings highlight the impact of the choice of modelling technique on model performance and individual estimated probabilities, and consequently on risk-based clinical decision making.
Supplementary Information: The online version contains supplementary material available at 10.1186/s41512-025-00216-5.
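The central point of this abstract, that models with the same discrimination can still assign different probabilities to the same patient, can be sketched in a few lines. The example below is only an illustration under stated assumptions: the outcomes, the probabilities, the 0.20 threshold, and the square-root transform standing in for an "overestimating" model are all hypothetical and are not taken from the study. Because the AUC depends only on how patients are ranked, any strictly increasing transform of the probabilities leaves it unchanged while shifting individual estimates across the decision threshold.

```python
# Illustrative sketch with synthetic numbers; not the study's data or models.

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def classify(probs, threshold):
    """Label a patient as high risk (1) when the estimate reaches the threshold."""
    return [int(p >= threshold) for p in probs]

# Hypothetical outcomes (1 = DVT present) and model A's estimated probabilities.
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
model_a = [0.05, 0.08, 0.12, 0.15, 0.22, 0.30, 0.35, 0.55, 0.70, 0.90]

# Hypothetical model B: ranks patients identically but pushes every estimate
# upward (a strictly increasing transform), mimicking systematic overestimation.
model_b = [p ** 0.5 for p in model_a]

threshold = 0.20  # hypothetical decision threshold
auc_a, auc_b = auc(y, model_a), auc(y, model_b)
class_a, class_b = classify(model_a, threshold), classify(model_b, threshold)
changed = [i for i, (a, b) in enumerate(zip(class_a, class_b)) if a != b]

print(f"AUC A = {auc_a:.2f}, AUC B = {auc_b:.2f}")
print("Patients classified differently:", changed)
```

In this toy setting the two models are indistinguishable by AUC, yet several patients fall on opposite sides of the threshold, which is why calibration and individual-level agreement need to be inspected alongside discrimination.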
- Research Article
34
- 10.1016/j.avsg.2013.08.012
- Nov 7, 2013
- Annals of Vascular Surgery
Deep Venous Thrombosis after Saphenous Endovenous Radiofrequency Ablation: Is it Predictable?
- Research Article
23
- 10.1007/s11357-011-9265-x
- May 20, 2011
- AGE
Arterial and venous thrombosis have always been regarded as different pathologies, and epidemiological studies have examined the association between venous thrombosis and indicators of atherosclerosis and/or arterial thromboembolic events. We measured the flow-mediated dilation (FMD), a well-known marker of arterial endothelial dysfunction, in young-middle-aged and old-aged patients with and without unprovoked deep venous thrombosis (DVT). The aim of this study was to investigate whether DVT was a significant predictor of impaired FMD, considering all the patients and young-middle-aged (age < 65 years) and old-aged (age ≥ 65 years) patients separately. FMD was measured in the brachial artery in a population of 120 subjects with the same atherosclerosis risk factors, 68 male and 52 female, 70 young-middle-aged subjects (mean age ± SD 49.5 ± 10.5 years) and 50 old-aged subjects (76.2 ± 7.7 years). Patients with DVT showed a significant decrease of FMD compared to patients without DVT (6.8 ± 5.5% vs. 10.9 ± 3.5%, p < 0.001). Moreover, old-aged patients showed a significant decrease of FMD compared to the young-middle-aged subjects (7.4 ± 4.1% vs. 9.8 ± 5.3%, p = 0.005). In the whole study population, DVT was strongly associated with FMD (risk factors adjusted β = -4.14, p < 0.001). A significant interaction between age and the presence of DVT in predicting FMD was found (p = 0.003), suggesting a differential behavior of DVT as a predictor of FMD. In the young-middle-aged group, the multivariate model confirmed that DVT was the most significant predictor of continuous FMD (β = -6.06, p < 0.001). In contrast, DVT was no longer a predictor of FMD in the old-aged group (β = -0.73, p = 0.556). Furthermore, old-aged patients without DVT showed a statistically significant decrease of FMD compared to the young-middle-aged subjects without DVT (8.2 ± 2.1% vs. 12.6 ± 2.7%, p < 0.001), whereas old-aged patients with DVT showed no statistically significant decrease of FMD compared to the young-middle-aged patients with DVT (6.7 ± 5.3% vs. 6.8 ± 5.7%, p = 0.932). In conclusion, young-middle-aged patients with spontaneous DVT show an impaired FMD, whereas this impairment in old-aged subjects is evident independently of the presence or absence of DVT. Aging per se may be associated with physiologic abnormalities in the systemic arteries and with endothelial dysfunction.
- Research Article
8
- 10.1007/s11606-017-4170-3
- Sep 15, 2017
- Journal of General Internal Medicine
The Wells score for deep venous thrombosis (DVT) has a high failure rate and low efficiency among inpatients. We aimed to create and validate an inpatient-specific risk stratification model to help assess pre-test probability of DVT in hospitalized patients. This was a prospective cohort study of hospitalized patients undergoing lower-extremity ultrasonography studies (LEUS) for suspected DVT. Demographics, physical findings, medical history, medications, hospitalization, and laboratory and imaging results were collected. Samples were divided into model derivation (patients undergoing LEUS 11/1/2012-12/31/2013) and validation cohorts (LEUS 1/1/2014-5/31/2015). A DVT prediction rule was derived using the recursive partitioning algorithm (decision tree-type approach) and was then validated. Participants were adult inpatients undergoing LEUS for suspected DVT from November 2012 to May 2015, excluding those with DVT in the prior 3 months, at a 793-bed, urban academic quaternary-care hospital with ~50,000 admissions annually. The primary outcome was the presence of proximal DVT, and the secondary outcome was the presence of any DVT (proximal or distal). Model sensitivity and specificity for predicting DVT were calculated. Recursive partitioning yielded four variables (previous DVT, active cancer, hospitalization ≥ 6 days, age ≥ 46 years) that optimized the prediction of proximal DVT and yield in the derivation cohort. From this decision tree, we stratified a scoring system using the validation cohort, categorizing patients into low- and high-risk groups. The incidence rates of proximal DVT were 2.9% and 12.0%, and of any DVT were 5.2% and 21.0%, for the low- and high-risk groups, respectively. The AUC for the discriminatory accuracy of the Center for Evidence-Based Imaging (CEBI) score for risk of proximal DVT identified on LEUS was 0.73. Model sensitivity was 98.1% for proximal and 98.1% for any DVT.
In hospitalized adults, specific factors can help clinicians predict risk of DVT, identifying those with low pre-test probability, in whom ultrasonography can be safely avoided.
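The sensitivity and specificity reported for a risk rule like this one come from a 2x2 table of rule result against the ultrasound reference. The sketch below shows the arithmetic only; the counts are hypothetical placeholders, not the study's data, and the function name `diagnostic_metrics` is introduced here for illustration.

```python
# Illustrative sketch: the counts below are hypothetical, not from the study.

def diagnostic_metrics(tp, fp, fn, tn):
    """Classification measures from a 2x2 table of rule result vs. reference test."""
    return {
        "sensitivity": tp / (tp + fn),  # share of DVT cases flagged high risk
        "specificity": tn / (tn + fp),  # share of non-DVT cases flagged low risk
        "ppv": tp / (tp + fp),          # probability of DVT given high risk
        "npv": tn / (tn + fn),          # probability of no DVT given low risk
    }

metrics = diagnostic_metrics(tp=45, fp=30, fn=5, tn=120)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A rule intended to rule out DVT safely is typically judged mainly on sensitivity and negative predictive value, since a missed case is the costly error in that use.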
- Research Article
58
- 10.1016/s0039-6109(16)45585-4
- Dec 1, 1991
- Surgical Clinics of North America
Deep Venous Thrombosis and Pulmonary Embolism
- Research Article
5
- 10.1016/j.jstrokecerebrovasdis.2012.02.009
- Mar 16, 2012
- Journal of Stroke and Cerebrovascular Diseases
Association of Deep Venous Thrombosis with Calf Vein Diameter in Acute Hemorrhagic Stroke
- Research Article
- 10.34172/jrcm.2021.038
- Dec 23, 2021
- Journal of Research in Clinical Medicine
Background: In this study, we aimed to evaluate computed tomography (CT) findings of peripheral pulmonary artery aneurysms (PPAA) associated with Behcet Disease (BD), Hughes Stovin Syndrome (HSS), and idiopathic origin. Methods: Contrast-enhanced CT scans of the patients were retrospectively reviewed regarding PPAA. The patients with PPAA were classified into BD, HSS, and idiopathic groups according to the etiology. The groups were compared for demographic features including age and gender, multiplicity, distribution, thrombosis, accompanying pulmonary artery embolism (PAE), and deep venous thrombosis (DVT) history. Results: A total of 30 PPAA (25.4±13.4 (11-62) mm) were detected in 10 (2.3%) patients (mean age 39.8±22-1.0[8-73] years, female/male: 3/7) among 4391 patients reviewed. Multiple aneurysms were detected in 7 patients, while a solitary lesion was seen in 3. Lower lobe involvement was most common (right 8, left 8; 53.2%). Thrombosis was detected within 19 (63.4%) aneurysms. Among the 10 patients with PPAA, 4 (40%) had BD, 2 (20%) had HSS, and 4 (40%) had an idiopathic origin. Accompanying PAE was present in 5 (50%) patients, and 3 (30%) had a DVT history. Patients with BD and HSS tended to have multiple lesions more often than those with idiopathic origin. Accompanying PAE was observed in 2 (100%) patients with HSS, 2 (50%) with BD, and 1 (25%) in the idiopathic group. A DVT history was recorded in 2 (100%) patients with HSS and 1 (25%) with BD. None of the patients in the idiopathic group had a DVT history. The only rupture was observed in the HSS group. Conclusion: Vasculitic diseases that lead to PPAA, including BD and HSS, are more likely to be associated with complications and additional morbid conditions than idiopathic processes.
- Front Matter
94
- 10.1016/j.arth.2005.01.014
- Apr 1, 2005
- The Journal of Arthroplasty
Prophylaxis for Thromboembolic Disease: Recommendations From the American College of Chest Physicians—Are They Appropriate for Orthopaedic Surgery?
- Research Article
14
- 10.1016/s0002-9610(99)80286-4
- Aug 1, 1995
- The American Journal of Surgery
Selective use of the duplex scan in diagnosis of deep venous thrombosis
- Research Article
13
- 10.1016/j.jvsv.2020.12.071
- Dec 16, 2020
- Journal of Vascular Surgery: Venous and Lymphatic Disorders
Elevated plasma factor VIII levels in a mixed patient population on anticoagulation and past venous thrombosis
- Research Article
4
- 10.1007/s11239-024-03010-0
- Jul 27, 2024
- Journal of thrombosis and thrombolysis
This study aimed to apply machine learning (ML) techniques to develop and validate a risk prediction model for post-stroke lower extremity deep vein thrombosis (DVT) based on patients' limb function, activities of daily living (ADL), clinical laboratory indicators, and DVT preventive measures. We retrospectively analyzed 620 stroke patients. Eight ML algorithms were used to build the models: logistic regression (LR), support vector machine (SVM), random forest (RF), decision tree (DT), neural network (NN), extreme gradient boosting (XGBoost), naive Bayes (NB), and K-nearest neighbor (KNN). These models were extensively evaluated using ROC curves, AUC, PR curves, PRAUC, accuracy, sensitivity, specificity, and decision curve analysis (DCA). SHapley Additive exPlanations (SHAP) were used to determine feature importance. Finally, based on the optimal ML algorithm, models built on different functional feature sets were compared with the Padua scale to select the best feature set. Our results indicated that the RF algorithm demonstrated superior performance in various evaluation metrics, including AUC (0.74/0.73), PRAUC (0.58/0.58), accuracy (0.75/0.77), and sensitivity (0.78/0.80) in the training set and test set, respectively. DCA revealed that the RF model had the highest clinical net benefit. SHAP analysis showed that D-dimer had the most significant influence on DVT, followed by age, Brunnstrom stage (lower limb), prothrombin time (PT), and mobility. The RF algorithm can predict post-stroke DVT to guide clinical practice.
- Research Article
5
- 10.1111/j.1600-0609.1984.tb02397.x
- Aug 1, 1984
- Scandinavian journal of haematology
In a prospective study, antithrombin III (AT III) was measured preoperatively, peroperatively immediately after the surgical procedure, and daily during the 8 postoperative days in 57 consecutive patients who underwent major abdominal surgery without prophylactic anticoagulant therapy. On day 8, according to the results of a bilateral radiological phlebography, the patients were divided into 2 groups: Group 1, presence of deep venous thrombosis (DVT), n = 28 (49%); and Group 2, absence of deep venous thrombosis, n = 29 (52%). The results of the study showed that the preoperative AT III value did not constitute a marker of the postoperative DVT risk. During the postoperative period, the AT III level decreased immediately following the intervention and returned to its preoperative value within 8 days. Nevertheless, this evolution did not differ between the 2 groups and was not related to the presence of postoperative DVT.
- Research Article
- 10.2147/clep.s501062
- Feb 1, 2025
- Clinical epidemiology
Psychiatric inpatients face an increased risk of deep vein thrombosis (DVT) due to their psychiatric conditions and pharmacological treatments. However, research focusing on this population remains limited. This study analyzed 17,434 psychiatric inpatients at Huzhou Third Municipal Hospital, incorporating data on demographics, psychiatric diagnoses, physical illnesses, laboratory results, and medication use. Predictive models for DVT were developed using logistic regression, random forest, support vector machine (SVM), and XGBoost (Extreme Gradient Boosting). Feature importance was assessed using the random forest model. The DVT incidence among psychiatric inpatients was 1.6%. Predictive model performance, measured by the area under the curve (AUC), showed logistic regression (0.900), random forest (0.885), SVM (0.890), and XGBoost (0.889) performed well. Logistic regression and random forest models exhibited optimal overall performance, while XGBoost excelled in recall. Significant predictors of DVT included elevated D-dimer levels, age, Alzheimer's disease, and Madopar use. Psychiatric inpatients require vigilance for DVT risk, with factors like D-dimer levels and age serving as critical indicators. Machine learning models effectively predict DVT risk, enabling early detection and personalized prevention strategies in clinical practice.
- Research Article
- 10.1161/circ.142.suppl_3.17258
- Nov 17, 2020
- Circulation
Background: D-Dimer values may be elevated in hyperinflammatory or prothrombotic states and are frequently measured in patients with coronavirus disease 2019 (COVID-19). Many institutional algorithms and ongoing studies suggest using D-Dimer cutoffs to initiate anticoagulation. The relationship between D-Dimer levels and deep venous thrombosis (DVT) has not been extensively studied specifically in patients with COVID-19. Methods: We retrospectively studied patients hospitalized at our institution between 2/1/20-5/19/20 for COVID-19 who underwent lower extremity venous doppler imaging. After stratifying by presence of DVT, baseline characteristics, vital signs, and laboratory values were assessed. We assessed the association between peak D-Dimer levels and diagnosis of DVT during admission. Upper limit D-Dimer value for the hospital’s laboratory assay was >20 mg/dL. Results: Of the 2677 patients admitted, 514 underwent lower extremity imaging, out of whom 186 (36.2%) were diagnosed with DVT. Other than history of cancer, which was more common in patients with a diagnosis of DVT (14.7% vs. 6.3%, p<0.01), baseline characteristics and presentation vital signs were similar between groups. Median peak D-Dimer levels were similar in patients with and without diagnosis of DVT [18.5 mg/dL, IQR: 6.4-20.0 vs. 12.2 mg/dL, IQR: 3.7-20, p = 0.80]. Density plots of initial D-Dimer values grouped by presence of DVT are presented in Figure 1. Conclusions: In this analysis of patients hospitalized with COVID-19, DVT was frequently diagnosed in patients who underwent imaging. There was considerable overlap of peak D-Dimer values in patients with and without documented DVT. As such, elevation in D-Dimer values alone should not prompt routine initiation of therapeutic anticoagulation in COVID-19 patients. Data from prospective clinical trials and registries regarding optimal antithrombotic practices in this patient population is needed.
- Research Article
12
- 10.3390/jcm9051257
- Apr 26, 2020
- Journal of Clinical Medicine
This study was performed to investigate the relationship between patients’ activity and function levels and the incidence of preoperative deep venous thrombosis (DVT) prior to total hip arthroplasty (THA). We retrospectively reviewed 500 patients admitted for primary or revision THA from July 2014 to October 2018. The diagnosis of DVT was confirmed using Doppler ultrasonography 1 month before THA. The patients’ activity and hip function were evaluated using several clinical scores: the Harris Hip Score (HHS), Oxford Hip Score (OHS), University of California Los Angeles (UCLA) activity score, and visual analog scale (VAS) score. Those scores and the medical history were examined for correlations with preoperative DVT using univariate and multivariate models. Univariate regression analysis showed that older age, current steroid use, anticoagulant use, a history of DVT, collagen disease, a lower UCLA activity score, and a lower OHS were associated with an elevated risk of preoperative DVT. The multivariate analyses showed that a higher UCLA activity score (odds ratio (OR): 0.0049–0.012) and higher OHS (OR: 0.0012–0.0088) were associated with a lower risk of preoperative DVT in each model. Age (OR: 1.07 in both models), current steroid use (OR: 9.32–10.45), and a history of DVT (OR: 27.15–74.98) were associated with a higher risk of preoperative DVT in both models. Older age, current steroid use, a history of DVT, a lower UCLA activity score, and a lower OHS were risk factors for preoperative DVT before THA, even when controlling for potential confounders. Patients exhibiting low activity and low function levels were more likely to have DVT, even before surgery.
- Research Article
- 10.3389/fsurg.2025.1648645
- Jan 1, 2025
- Frontiers in Surgery
Background: Lower extremity deep vein thrombosis (DVT) represents a prevalent and formidable complication among patients with gastrointestinal malignancies, exerting a profound impact on both prognosis and quality of life. Owing to its intricate pathogenesis, the development of a precise risk prediction model is imperative for advancing clinical strategies in prevention and therapeutic intervention.
Methods: This retrospective study enrolled patients with gastrointestinal malignancies using multicenter, longitudinal clinical data obtained from three tertiary medical centers between 2020 and 2024. A total of 34 variables were extracted, encompassing demographic profiles, clinical parameters, tumor-specific characteristics, and laboratory indices. To identify independent predictors of DVT, both univariate and multivariate analyses were initially performed. Four machine learning algorithms (Extreme Gradient Boosting (XGBoost), Random Forest (RF), Support Vector Machine (SVM), and k-Nearest Neighbors (KNN)) were subsequently constructed to predict DVT risk. Model performance was rigorously assessed through receiver operating characteristic (ROC) curves, calibration plots, Brier scores, and decision curve analysis (DCA). Internal validation was conducted via ten-fold cross-validation, while an independent external cohort was employed to evaluate model generalizability. To elucidate the underlying predictive mechanisms, SHapley Additive exPlanations (SHAP) analysis was carried out.
Results: Through a combination of univariate and multivariate analyses alongside four machine learning algorithms, surgery, prolonged immobilization, central venous catheterization, radiotherapy, distant metastasis, and chemotherapy emerged as significant high-risk factors for DVT. All four predictive models exhibited robust performance, with the XGBoost model demonstrating superior discrimination, calibration, and clinical utility. Findings from the external validation cohort further substantiated its stability and generalizability. SHAP analysis illuminated the relative contributions and directional influences of pivotal variables within the predictive framework.
Conclusion: Machine learning models derived from multicenter, longitudinal clinical datasets offer robust predictive capabilities for assessing DVT risk in patients with gastrointestinal malignancies. These models furnish clinicians with individualized risk stratification tools, facilitating the refinement of preventive strategies and the enhancement of clinical decision-making, ultimately contributing to improved patient management.
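Of the performance measures named in this abstract, the Brier score is the simplest to state: it is the mean squared difference between each predicted probability and the observed 0/1 outcome, so lower is better. The sketch below is a minimal illustration with made-up outcomes and probabilities; none of the numbers come from the study, and the comparison against a constant "predict the prevalence" reference is an added convention rather than something the abstract describes.

```python
# Illustrative sketch with hypothetical values; not the study's data.

def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)

y = [1, 0, 1, 0]                    # hypothetical outcomes (1 = DVT)
model_probs = [0.9, 0.1, 0.8, 0.3]  # hypothetical model estimates

# Reference: a non-informative model that predicts the prevalence for everyone.
prevalence = sum(y) / len(y)
reference_probs = [prevalence] * len(y)

model_brier = brier_score(y, model_probs)
reference_brier = brier_score(y, reference_probs)

print(f"Model Brier score:     {model_brier:.4f}")
print(f"Reference Brier score: {reference_brier:.4f}")
```

Because the Brier score mixes discrimination and calibration into one number, it is usually read together with a calibration plot rather than on its own.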