Handling Class Imbalance Research Articles

ABSTRACTClass imbalance is a critical challenge in big data analytics, often leading to biased predictive models. This imbalance can lead to biased models that perform well on the majority class but poorly on the minority class. Many machine learning models tend to be biased towards the majority class because they aim to minimise overall error, often leading to poor performance on the minority class. This paper presents the balanced ensemble neural network, a novel solution to effectively address class imbalance in big data. Balanced ensemble neural network combines the robust capabilities of neural networks with the power of ensemble learning, incorporating class balancing strategies to ensure fair representation of minority classes. The methodology involves integrating multiple neural networks, each trained on balanced subsets of data using techniques like Synthetic Minority Over‐sampling Technique and Random Undersampling. This integration aims to leverage the strengths of individual networks while reducing their inherent biases. Our extensive experiments across various datasets reveal that BENN achieves an AUC‐ROC score of 0.94, surpassing other models such as random forest (0.88), support vector (0.84) and single neural net (0.80). It was also observed that BENN's performance is better compared to traditional neural network models and standard ensemble methods in key metrics like accuracy, precision, recall, F1‐score and AUC‐ROC. The results specifically highlight BENN's effectiveness in accurately classifying instances of minority classes, a notable challenge in many existing models. These findings underscore BENN's potential as a substantial advancement in handling class imbalance within big data environments, offering a promising direction for future research and application in machine learning.

Read full abstract

Cyberattack classification involves applying deep learning (DL) and machine learning (ML) models to categorize digital threats based on their features and behaviors. These models examine system logs, network traffic, or other associated data patterns to discriminate between standard activities and malicious actions. Efficient cyberattack classification is vital for on-time threat detection and response, permitting cybersecurity specialists to categorize and reduce potential risks to a system. Handling class-imbalanced data in cyberattack classification using DL is critical for achieving exact and robust models. In cybersecurity databases, instances of normal behavior frequently significantly outnumber instances of cyberattacks, foremost due to biased methods that may complete poorly on minority classes. To address this issue approaches such as oversampling the lesser class, undersampling the popular class, or using more advanced systems can be used. These plans defend that the DL technique is more complex when determining cyberattacks, so it increases complete performance and adapts the effect of the imbalance class on the classification results. This study presents a novel Hybrid Salp Swarm Algorithm with a DL Approach for Cyberattack Classification (HSSADL-CAC) technique. The HSSADL-CAC method intends to resolve class imbalance data handling with an optimum DL model for the recognition of cyberattacks. At first, the HSSADL-CAC method experiences data normalization as a pre-processing stage. The HSSADL-CAC technique uses the ADASYN approach to handle class imbalance problems. In addition, the HSSADL-CAC technique applies an HSSA-based feature selection approach. The HSSADL-CAC technique detects cyberattacks using a deep extreme learning machine (DELM) model. Finally, the hyperparameter tuning of the ELM model takes place by utilizing the beluga whale optimization (BWO) model. The performance analysis of the HSSADL-CAC technique employs a benchmark database. The comprehensive comparison research indicates the superior performance of the HSSADL-CAC technique in the cyberattack detection procedure.

Read full abstract

Handling Class Imbalance Research Articles

Related Topics

Articles published on Handling Class Imbalance

Research on Land Use and Land Cover Information Extraction Methods for Remote Sensing Images Based on Improved Convolutional Neural Networks

Public Sentiment Analysis About Neuralink from Twitter Using Naïve Bayes: Multinomial, Gaussian and Complement

Prediction of Stock Industry Sectors Listed on the Indonesia Stock Exchange (IDX) based on Financial Statements with the Random Forest Method

Predicting stroke occurrences: a stacked machine learning approach with feature selection and data preprocessing

BENN: Balanced Ensemble Neural Network for Handling Class Imbalance in Big Data

Ensemble of vision transformer architectures for efficient Alzheimer’s Disease classification

HIDIM: A novel framework of network intrusion detection for hierarchical dependency and class imbalance

Breast cancer prognosis through the use of multi-modal classifiers: current state of the art and the way forward.

Effect of Random Under sampling, Oversampling, and SMOTE on the Performance of Cardiovascular Disease Prediction Models

StAlK: Structural Alignment based Self Knowledge distillation for Medical Image Classification

An efficient diagnosis of heart disease using optimized cross-layer Densenet121 pyramid mutual attention network

Class imbalanced data handling with cyberattack classification using Hybrid Salp Swarm Algorithm with deep learning approach

Early warning systems for financial distress: A machine learning approach to corporate risk mitigation

Enhancing Machine Learning Model Performance in Addressing Class Imbalance

DBOS_US: a density-based graph under-sampling method to handle class imbalance and class overlap issues in software fault prediction

Handling Class Imbalance for Indonesian Twitter Sentiment Analysis A Comparative Study of Algorithms

Traffic Sign Recognition using HIICNN Stack Ensemble Method

Systematic Review of Models Usedto Handle Class Imbalance in Anomaly Detection for Energy Consumption

CIIR: an approach to handle class imbalance using a novel feature selection technique

Enhanced Classification of Imbalanced Medical Datasets using Hybrid Data-Level, Cost-Sensitive and Ensemble Methods

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Handling Class Imbalance Research Articles

Related Topics

Articles published on Handling Class Imbalance

Research on Land Use and Land Cover Information Extraction Methods for Remote Sensing Images Based on Improved Convolutional Neural Networks

Public Sentiment Analysis About Neuralink from Twitter Using Naïve Bayes: Multinomial, Gaussian and Complement

Prediction of Stock Industry Sectors Listed on the Indonesia Stock Exchange (IDX) based on Financial Statements with the Random Forest Method

Predicting stroke occurrences: a stacked machine learning approach with feature selection and data preprocessing

BENN: Balanced Ensemble Neural Network for Handling Class Imbalance in Big Data

Ensemble of vision transformer architectures for efficient Alzheimer’s Disease classification

HIDIM: A novel framework of network intrusion detection for hierarchical dependency and class imbalance

Breast cancer prognosis through the use of multi-modal classifiers: current state of the art and the way forward.

Effect of Random Under sampling, Oversampling, and SMOTE on the Performance of Cardiovascular Disease Prediction Models

StAlK: Structural Alignment based Self Knowledge distillation for Medical Image Classification

An efficient diagnosis of heart disease using optimized cross-layer Densenet121 pyramid mutual attention network

Class imbalanced data handling with cyberattack classification using Hybrid Salp Swarm Algorithm with deep learning approach

Early warning systems for financial distress: A machine learning approach to corporate risk mitigation

Enhancing Machine Learning Model Performance in Addressing Class Imbalance

DBOS_US: a density-based graph under-sampling method to handle class imbalance and class overlap issues in software fault prediction

Handling Class Imbalance for Indonesian Twitter Sentiment Analysis A Comparative Study of Algorithms

Traffic Sign Recognition using HIICNN Stack Ensemble Method

Systematic Review of Models Usedto Handle Class Imbalance in Anomaly Detection for Energy Consumption

CIIR: an approach to handle class imbalance using a novel feature selection technique

Enhanced Classification of Imbalanced Medical Datasets using Hybrid Data-Level, Cost-Sensitive and Ensemble Methods