Imbalanced Distribution Research Articles

Background: In the U.S., about 8% of adults have never received cholesterol screening. Although machine learning (ML) has been used to develop decision tools for Atherosclerotic Cardiovascular Disease (ASCVD) risk prediction, its application in behavioral forecasting has not yet been explored in the context of cholesterol screening behaviors. This study aimed to examine the performance and accuracy of ML algorithms in forecasting cholesterol screening behaviors in adults after age 50.

Methods: This analysis used deidentified data from the Health and Retirement Study (HRS) 2004-2018. HRS is a longitudinal survey of 23,000 households in the U.S. Participants were excluded from the current analysis if they had died by 2019, had ever had ASCVD or stroke, were under age 50 at baseline, or had missing data on self-reported cholesterol screening. In total, 7176 participants (mean age [SD]=62 [8]) met the inclusion criteria; participants were randomly split into a training set (80%) and a testing set (20%). The synthetic minority oversampling technique (SMOTE) was used to address the imbalanced distribution of the rare event. Five ML algorithms were used: random forest, gradient boosting machine (GBM), XGBoost, Support Vector Machine (SVM), and logistic regression. Accuracy, AUROC, and positive predictive value (PPV) were used to compare model performance. The average gain was evaluated for feature importance in the demographic and health domains.

Results: In total, 232 (3.2%) respondents did not receive any cholesterol screening from 2008 to 2018. Experiments with the five ML algorithms suggested that XGBoost configured with deeper trees and an adjusted learning rate performed better in classifying those who did not screen for cholesterol levels over 10 years. Adding prior cholesterol screening history (2004-2006) to the model significantly improved model performance. Hypertension, self-rated health, and smoking were the major health features, while insurance, poverty, and work status were the major demographic features in the predictive model (accuracy=0.97; AUROC=0.88; PPV=0.42).

Conclusion: Findings underscore the potential utility of ML models in predicting cholesterol screening behaviors after age 50. This could be the basis for developing decision tools that help clinicians identify those less likely to undergo cholesterol screening or send reminders accordingly. The low-cost predictive model might improve the uptake of preventive screening behaviors in middle-aged and older adults.
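The oversampling step named in the Methods can be illustrated with a minimal sketch. It assumes the basic form of the synthetic minority oversampling technique: pick a rare-class sample, pick one of its k nearest rare-class neighbours, and create a new point somewhere on the line between them. The function name `smote`, its parameters, and the toy data are hypothetical and are not taken from the study, which does not report its implementation or settings.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric feature tuples (rare class only).
    This is an illustrative sketch, not the study's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest rare-class neighbours of the base point (itself excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        mate = rng.choice(neighbours)
        # The new point sits at a random position on the segment base -> mate
        gap = rng.random()
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Toy data (not HRS data): 3 rare-class rows versus 9 majority rows
minority = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3)]
majority = [(5.0 + 0.1 * i, 4.0 - 0.1 * i) for i in range(9)]
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority), len(majority))
```

In common practice, oversampling of this kind is applied to the training split only, so that the testing set keeps the original class proportions. The abstract does not say at which point the technique was applied.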

Field-road trajectory classification is a crucial task for agricultural machinery behavior mode recognition, aiming to distinguish field operation mode from road driving mode automatically. However, the imbalanced distribution of agricultural machine trajectories poses challenges for the field-road trajectory classification task. Additionally, most existing field-road trajectory classification methods have certain shortcomings. For instance, they have difficulty accurately representing the state of agricultural machinery movement using the current features, the data transformation process often leads to information loss, and the models’ generalization capabilities are limited. Each of these factors constrains model performance. To address these shortcomings, this paper introduces a general image classification model for agricultural machinery trajectory mode recognition named ATRNet. First, to address the issue of imbalanced field-road proportions in agricultural machinery trajectory data, a Conditional Tabular Generative Adversarial Network (CTGAN) is employed to generate quasi trajectories, balancing the distribution of positive and negative samples in the data. This step aims to eliminate biases during the model training process. Second, to accurately characterize the motion status of agricultural machinery, we propose a multiangle feature enhancement method to extract rich spatiotemporal features from trajectory data. Finally, unlike conventional field-road trajectory classification models that primarily rely on spatial and temporal information for identifying trajectories, we present a lossless trajectory data representation paradigm. This paradigm maps each trajectory point into a “feature map” and uses an image classification model to capture latent feature representations of trajectory points for the recognition of different behavior modes of agricultural machinery. This paradigm can generalize image classification networks to the field-road trajectory classification task, providing a general vision model solution for agricultural machinery trajectory mode recognition. To validate the effectiveness of the ATRNet model, experiments were conducted on real corn and wheat harvester trajectory datasets. The results demonstrate that the proposed model achieves remarkable performance improvements over the state-of-the-art (SOTA) models. On the corn harvester trajectory dataset, ATRNet achieves an accuracy of 92.36% and an F1-score of 92.34%, surpassing existing SOTA models by 3.12% and 12.46%, respectively. Similarly, on the wheat harvester trajectory dataset, ATRNet achieves an accuracy of 92.36% and an F1-score of 92.33%, outperforming the existing optimal algorithm by 4.76% and 18.18%, respectively.
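The idea of mapping each trajectory point into a "feature map" can be made concrete with a rough sketch. The abstract does not specify which features are used or how the map is laid out, so everything below is an assumption made for illustration: each point gets two hypothetical features (speed and absolute heading change), and its map is a small matrix whose rows are the neighbouring points in a fixed window. The names `point_features` and `feature_map` are invented here and are not ATRNet's.

```python
from math import atan2, hypot, pi


def point_features(points):
    """Per-point (speed, |heading change|) from (x, y, t) samples.

    Assumed features for illustration; the paper's feature set is not given.
    """
    if len(points) < 2:
        raise ValueError("need at least two trajectory points")
    feats = []
    prev_heading = None
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dt = t1 - t0
        speed = hypot(x1 - x0, y1 - y0) / dt if dt > 0 else 0.0
        heading = atan2(y1 - y0, x1 - x0)
        if prev_heading is None:
            turn = 0.0
        else:
            # Wrap the heading difference into [-pi, pi)
            turn = (heading - prev_heading + pi) % (2 * pi) - pi
        feats.append((speed, abs(turn)))
        prev_heading = heading
    feats.append(feats[-1])  # the last point reuses the final segment's values
    return feats


def feature_map(feats, i, half_window=2):
    """Matrix for point i: rows are neighbouring points, columns are features."""
    n = len(feats)
    rows = []
    for j in range(i - half_window, i + half_window + 1):
        j = min(max(j, 0), n - 1)  # clamp at the track ends (edge padding)
        rows.append(list(feats[j]))
    return rows


# Toy track: a fast straight stretch, then slow movement with sharp turns
track = [(0, 0, 0), (10, 0, 1), (20, 0, 2), (21, 0, 3), (21, 1, 4), (20, 1, 5)]
feats = point_features(track)
maps = [feature_map(feats, i) for i in range(len(track))]
print(len(maps), len(maps[0]), len(maps[0][0]))
```

Each resulting matrix has a fixed shape, so a standard image classification network could in principle take it as a single-channel input and label the centre point as field or road. Whether the published model builds its maps this way is not stated in the abstract.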

Articles published on Imbalanced Distribution

Large-Scale Object Detection in the Wild With Imbalanced Data Distribution, and Multi-Labels.

Triplet Adaptation Framework for Robust Semi-Supervised Learning.

Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation.

Many-Objective Jaccard-Based Evolutionary Feature Selection for High-Dimensional Imbalanced Data Classification.

TC-Sniffer: A Transformer-CNN Bibranch Framework Leveraging Auxiliary VOCs for Few-Shot UBC Diagnosis via Electronic Noses.

Gyroscope in-assembly drift anomaly detection based on decision re-optimized deep auto-encoder

Abstract 4139194: Predicting Cholesterol Screening Behavior After Age 50 Using Machine Learning: Insights from the Health and Retirement Study

A Comprehensive Survey on Rare Event Prediction

Multi‐Objective Federated Averaging Algorithm

Multi-classification prediction of PM2.5 concentration based on improved adaptive boosting rotation forest

A general image classification model for agricultural machinery trajectory mode recognition

Separating Noisy Samples From Tail Classes for Long-Tailed Image Classification With Label Noise.

Privacy-Preserving Incipient Fault Identification in Distribution Networks Under Small Sample and Imbalanced Data Distribution Conditions

A Study on the Impact of Overtaking Lane-Changing Behaviour in Expressway Interchange Weaving Areas

Monocular Depth and Ego-motion Estimation with Scale Based on Superpixel and Normal Constraints

Learnable feature alignment with attention-based data augmentation for handling data issue in ancient documents

Comparison of Machine Learning Models and Feature Importance Investigation of Intelligent Fault Diagnosis Methods for Robots Based on Datasets Across Various Distributions

Improving long‐tail classification via decoupling and regularisation

Robust Visual Question Answering utilizing Bias Instances and Label Imbalance

A Semantically Enhanced Label Prediction Method for Imbalanced POI Data Category Distribution
