Imbalanced Medical Data Research Articles

Objective: Data imbalance is a pervasive issue in medical data mining, often leading to biased and unreliable predictive models. This study aims to address the urgent need for effective strategies to mitigate the impact of data imbalance on classification models. We focus on quantifying the effects of different imbalance degrees and sample sizes on model performance, identifying optimal cut-off values, and evaluating the efficacy of various methods to enhance model accuracy in highly imbalanced and small sample size scenarios.

Methods: We collected medical records of patients receiving assisted reproductive treatment in a reproductive medicine center. Random forest was used to screen the key variables for the prediction target. Various datasets with different imbalance degrees and sample sizes were constructed to compare the classification performance of logistic regression models. Metrics such as AUC, G-mean, F1-Score, Accuracy, Recall, and Precision were used for evaluation. Four imbalance treatment methods (SMOTE, ADASYN, OSS, and CNN) were applied to datasets with low positive rates and small sample sizes to assess their effectiveness.

Results: The logistic model's performance was low when the positive rate was below 10% but stabilized beyond this threshold. Similarly, sample sizes below 1200 yielded poor results, with improvement seen above this threshold. For robustness, the optimal cut-offs for positive rate and sample size were identified as 15% and 1500, respectively. SMOTE and ADASYN oversampling significantly improved classification performance in datasets with low positive rates and small sample sizes.

Conclusions: The study identifies a positive rate of 15% and a sample size of 1500 as optimal cut-offs for stable logistic model performance. For datasets with low positive rates and small sample sizes, SMOTE and ADASYN are recommended to improve balance and model accuracy.
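To make the oversampling idea and the G-mean metric named in this abstract concrete, here is a minimal sketch in plain Python. It is not the study's code: the function names `smote_sketch` and `g_mean`, the toy points, and the parameter defaults are all assumptions made for illustration. The first function follows the general SMOTE recipe of interpolating between a minority sample and one of its nearest minority neighbours; the second computes G-mean as the geometric mean of sensitivity and specificity.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by SMOTE-style interpolation.

    Illustrative only: a real analysis would normally use a tested library.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (recall) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Toy data (hypothetical, not from the study): 4 positives and 36 negatives,
# i.e. a 10% positive rate before oversampling.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 36
new_points = smote_sketch(minority, n_new=32, k=3)

n_pos = len(minority) + len(new_points)
positive_rate = n_pos / (n_pos + majority_count)
print(f"positive rate after oversampling: {positive_rate:.2f}")

# A model that predicts "negative" for every case in a 10%-positive sample
# is 90% accurate, yet its G-mean is zero because sensitivity is zero.
print(g_mean(tp=0, fn=10, tn=90, fp=0))
```

The second print illustrates why the abstract reports G-mean and F1-Score alongside Accuracy: at low positive rates, accuracy alone can look acceptable for a model that never identifies a positive case.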

Objective: Lung image classification-assisted diagnosis has a large application market. Existing translation models suffer from poor attention, insufficient ability for key transfer and generation, insufficient quality of generated images, and a lack of detailed features. To address these problems, this paper conducts research on lung medical image translation and lung image classification based on generative adversarial networks.

Methods: This paper proposes MI-GAN, a medical image multi-domain translation algorithm based on the key migration branch. After analysis of the imbalanced medical image data, the key target domain images are selected, the key migration branch is established, and a single generator is used to complete the medical image multi-domain translation. The conversion between domains ensures the attention performance of the medical image multi-domain translation model and the quality of the synthesized images. At the same time, a lung image classification model based on synthetic image data augmentation is proposed. The synthetic lung CT medical images and the original real medical images are used together as the training set to study the performance of the auxiliary diagnosis model in classifying normal healthy subjects and patients with mild and severe COVID-19.

Results: On the chest CT image dataset, MI-GAN completed the mutual conversion and generation of normal lung images without disease, viral pneumonia images, and mild COVID-19 images. The GAN-test and GAN-train indicators of the synthetic images reached 92.188% and 85.069%, respectively, a considerable improvement in authenticity and diversity compared with other generative models. The accuracy of pneumonia diagnosis of the lung image classification model is 93.85%, which is 3.1% higher than that of the diagnosis model trained only with real images; the sensitivity of disease diagnosis is 96.69%, a relative improvement of 7.1%; the specificity was 89.70%; and the area under the ROC curve (AUC) increased from 94.00% to 96.17%.

Conclusion: In this paper, a multi-domain translation model of medical images based on the key transfer branch is proposed, which gives the translation network key transfer and attention performance. It was verified on lung CT images and achieved good results. The required medical images are synthesized by this medical image translation model, and the effectiveness of the synthesized images for the lung image classification network is verified experimentally.

Articles published on Imbalanced Medical Data

A Comprehensive Analysis of a Framework for Rebalancing Imbalanced Medical Data Using an Ensemble-based Classifier

Effect of Random Under sampling, Oversampling, and SMOTE on the Performance of Cardiovascular Disease Prediction Models

Processing imbalanced medical data at the data level with assisted-reproduction data as an example

A hybrid feature weighting and selection-based strategy to classify the high-dimensional and imbalanced medical data

Cost-sensitive learning for imbalanced medical data: a review

Improving Classification Performance on Imbalanced Medical Data using Generative Adversarial Network

A Hybrid Deep Learning Approach for Epileptic Seizure Detection in EEG signals.

The Performance Comparison between C4.5 Tree and One-Dimensional Convolutional Neural Networks (CNN1D) with Tuning Hyperparameters for the Classification of Imbalanced Medical Data

CHARACTERIZATION OF MORTALITY PREDICTION: AN ENSEMBLE LEARNING ANALYSIS USING THE MIMIC-III DATASET

Imbalanced Multiclass Medical Data Classification based on Learning Automata and Neural Network

An Oversampling Algorithm combining SMOTE and RF for Imbalanced Medical Data

Two Directions for Clinical Data Generation with Large Language Models: Data-to-Label and Label-to-Data.

Flexible cloglog links for binomial regression models as an alternative for imbalanced medical data.

BiLSTM deep neural network model for imbalanced medical data of IoT systems

Personalized Retrogress-Resilient Federated Learning Toward Imbalanced Medical Data.

Multi-domain medical image translation generation for lung image classification based on generative adversarial networks

LDADN: a local discriminant auxiliary disentangled network for key-region-guided chest X-ray image synthesis augmented in pneumoconiosis detection.

Classification of imbalanced medical data: An empirical study of machine learning approaches

Tversky Similarity based UnderSampling with Gaussian Kernelized Decision Stump Adaboost Algorithm for Imbalanced Medical Data Classification

A cluster-based oversampling algorithm combining SMOTE and k-means for imbalanced medical data
