Borderline SMOTE Research Articles

Chronic subdural hematoma (CSDH) is a neurological condition with high recurrence rates, observed primarily in the elderly. Although several risk factors have been identified, predicting CSDH recurrence remains a challenge. Given the potential of machine learning (ML) to extract meaningful insights from complex data sets, our study aims to develop and validate ML models capable of accurately predicting postoperative CSDH recurrence. Data from 447 consecutive CSDH patients treated with burr-hole irrigation at Wenzhou Medical University's First Affiliated Hospital (December 2014 to April 2019) were studied. Of these, 312 patients formed the development cohort and 135 the test cohort. The Least Absolute Shrinkage and Selection Operator (LASSO) method was used to select the features most strongly associated with recurrence. Eight machine learning algorithms were used to construct prediction models for hematoma recurrence from demographic, laboratory, and radiological features. The Borderline Synthetic Minority Over-sampling Technique (Borderline-SMOTE) was applied to address class imbalance, and Shapley Additive Explanation (SHAP) analysis was used to improve model visualization and interpretability. Model performance was assessed using AUROC, sensitivity, specificity, F1 score, calibration plots, and decision curve analysis (DCA). Our optimized ML models exhibited prediction accuracies ranging from 61.0% to 86.2% for hematoma recurrence in the validation set. Notably, the Random Forest (RF) model surpassed the other algorithms, achieving an accuracy of 86.2%. SHAP analysis confirmed these results and highlighted key clinical predictors of CSDH recurrence risk, including age, alanine aminotransferase level, fibrinogen level, thrombin time, and maximum hematoma diameter. In the test dataset, the RF model yielded an accuracy of 92.6% with an AUC of 0.834.
Our findings underscore the efficacy of machine learning algorithms, notably the combination of the RF model with SMOTE, in forecasting the recurrence of postoperative chronic subdural hematoma. Using the RF model, we built an online calculator that may help tailor therapeutic strategies and enable timely preventive interventions for high-risk patients.
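The abstract above reports accuracy alongside sensitivity, specificity, and F1 score. As a minimal sketch of how these metrics relate under class imbalance, the snippet below computes them from confusion-matrix counts. The function name `binary_metrics` and the counts are hypothetical illustrations and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts.

    Assumes every denominator is non-zero (each class and each predicted
    label occurs at least once).
    """
    sensitivity = tp / (tp + fn)            # recall on the positive class
    specificity = tn / (tn + fp)            # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for an imbalanced test set (10 positives, 90 negatives).
metrics = binary_metrics(tp=8, fp=5, tn=85, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, accuracy can be high while F1 stays noticeably lower, which is one reason imbalanced-data studies usually report several metrics rather than accuracy alone.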

Three-dimensional Mineral Prospectivity Mapping (3DMPM) is an innovative approach to mineral exploration that combines multiple geological data sources to create a three-dimensional (3D) model of a mineral deposit. It provides an accurate representation of the subsurface that can be used to identify areas with mineral potential. 3D geological models, constructed from geological data sets from multiple sources, are the typical data source for 3D prospectivity modeling. Because in practice the ratio of mineralized to non-mineralized samples is strongly imbalanced, classification results are biased toward the majority class. Borderline-SMOTE (BLSMOTE) is an oversampling technique for imbalanced datasets that generates synthetic data points near the boundary between the minority and majority classes. This helps to create a more balanced dataset without introducing too much noise. Non-mineralized samples can be generated by randomly selecting non-mineralized locations, which introduces uncertainty. In this paper, we take the Lannigou gold deposit in Guizhou, a shallow-forming, low-temperature hydrothermal deposit, as an example to extract the ore-controlling elements and establish a 3D geological model. A total of 50 training samples are generated using the sampling method described above, and 50 mineralization prospects are generated using Random Forests. A return–risk analysis was used to explore the uncertainties associated with synthetic positive samples and randomly selected negative samples, and to determine the final mineral potential values. Based on the evaluation metrics G-mean and F-value, the model using BLSMOTE outperforms the model without synthetic oversampling and the models using SMOTE and KMeansSMOTE. The optimal model, BLSMOTE18, has an AUC of 0.9288. The method also shows superior performance on datasets with different levels of class imbalance.
After excluding predictions that highly overlap with known deposits, five target zones were delineated using a P-A plot, all of which have obvious metallogenic geological features. Among them, Target1 and Target2 have good mineralization potential, which is of great significance for future mineral exploration work.
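The idea of generating synthetic points near the class boundary can be illustrated with a short sketch. The code below follows the commonly described Borderline-SMOTE1 rule: a minority sample counts as "borderline" when at least half, but not all, of its k nearest neighbours belong to the majority class, and new points are interpolated between it and nearby minority samples. It is a simplified pure-Python illustration, not the implementation used in the paper; the function name `borderline_smote`, its parameters, and the toy data are all assumptions.

```python
import math
import random


def borderline_smote(X, y, minority=1, k=5, n_new=1, seed=0):
    """Return synthetic minority points generated from borderline samples."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    synthetic = []
    for i in min_idx:
        # k nearest neighbours of X[i] among all other samples.
        others = [j for j in range(len(X)) if j != i]
        nn = sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]
        n_majority = sum(1 for j in nn if y[j] != minority)
        # Keep only borderline samples: at least half, but not all,
        # of the neighbours are majority. All-majority samples are
        # treated as noise; mostly-minority samples are "safe".
        if not (k / 2 <= n_majority < k):
            continue
        # k nearest minority neighbours of the borderline sample.
        min_others = [j for j in min_idx if j != i]
        min_nn = sorted(min_others, key=lambda j: math.dist(X[i], X[j]))[:k]
        if not min_nn:
            continue
        for _ in range(n_new):
            j = rng.choice(min_nn)
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy 2D data (hypothetical): label 0 = majority, label 1 = minority.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1),
     (2.4, 0.5), (3, 0), (3, 1), (4, 0), (4, 1)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

new_points = borderline_smote(X, y, minority=1, k=3, n_new=4, seed=0)
print(len(new_points), "synthetic points generated")
```

In this toy set only the minority point at (2.4, 0.5) sits next to the majority cluster, so all synthetic points are interpolated from it. In practice a maintained library implementation would normally be preferred over a hand-written one.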

Articles published on Borderline SMOTE

The development of classification-based machine-learning models for the toxicity assessment of chemicals associated with plastic packaging

BES-Optimized SMOTE Variant to Improve Dataset Scaling for Enhanced Privacy-Preserving Classification

X-ray Image Analysis for Dental Disease: A Deep Learning Approach Using EfficientNets

A unified Foot and Mouth Disease dataset for Uganda: evaluating machine learning predictive performance degradation under varying distributions.

Predictive Modeling of COVID-19 Readmissions: Insights from Machine Learning and Deep Learning Approaches.

Infusing Weighted Average Ensemble Diversity for Advanced Breast Cancer Detection

Data oversampling and imbalanced datasets: an investigation of performance for machine learning and feature engineering

Financial risk forewarning with an interpretable ensemble learning approach: An empirical analysis based on Chinese listed companies

Anomaly detection in wind turbine blades based on PCA and convolutional kernel transform models: employing multivariate SCADA time series analysis

An imbalance data quality monitoring based on SMOTE-XGBOOST supported by edge computing

Synthetic minority over-sampling technique-enhanced machine learning models for predicting recurrence of postoperative chronic subdural hematoma.

A comparative study in class imbalance mitigation when working with physiological signals.

Oversampling method via adaptive double weights and Gaussian kernel function for the transformation of unbalanced data in risk assessment of cardiovascular disease

HBMD-Net: Feature Fusion Based Breast Cancer Classification with Class Imbalance Resolution.

Interpretable Machine Learning for Predicting Risk of Invasive Fungal Infection in Critically Ill Patients in the Intensive Care Unit: A Retrospective Cohort Study Based on MIMIC-IV Database.

Investigating the impact of influential factors on crash types for autonomous vehicles at intersections

3D Mineral Prospectivity Mapping from 3D Geological Models Using Return–Risk Analysis and Machine Learning on Imbalance Data

Determining Resampling Ratios Using BSMOTE and SVM-SMOTE for Identifying Rare Attacks in Imbalanced Cybersecurity Data

A clustered borderline synthetic minority over-sampling technique for balancing quick access recorder data

Optimizing sleep staging on multimodal time series: Leveraging borderline synthetic minority oversampling technique and supervised convolutional contrastive learning
