A suite of swarm dynamic multi-objective algorithms for rebalancing extremely imbalanced datasets

Jinyan Li, Simon Fong, Raymond K Wong, Sabah Mohammed, Jinan Fiaidhi, Yunsick Sung

doi:10.1016/j.asoc.2017.11.028

Abstract

Imbalanced datasets can be found in a number of fields; they are commonly regarded as big data because of their sheer volume and high attribute dimensionality. As the name suggests, imbalanced big datasets have an extremely skewed ratio between the number of majority-class and minority-class samples. Traditional methods have been attempted, but they still cannot fully, effectively, and reliably solve the imbalanced classification problem, especially when the class distribution is exceedingly imbalanced. In this paper, we propose a collection of algorithms to solve the problem of imbalanced datasets in binary classification. Most traditional methods rebalance an imbalanced dataset merely by matching the data quantities of the two classes. Our proposed algorithms, which take the form of a suite of variants, focus on guaranteeing the credibility of the classification model and reaching the greatest possible accuracy by dynamically rebalancing the training dataset with multi-objective swarm intelligence optimisation. The new algorithms are extended from those we proposed earlier, which pursued a single objective at a time: first find a set of solutions that satisfy the Kappa criterion, then search that set for the solution that offers the highest accuracy. Two main modifications are made in the new algorithms. The first is multi-objective optimisation, which aims to find a solution that satisfies several criteria at the same time, such as accuracy and a list of credibility indicators. The second is the incremental operation of the multi-objective optimisation. Incremental optimisation is imperative for processing data feeds that may arrive in a streaming manner. Instead of waiting for the full data archive to be available before optimisation, incremental optimisation rebalances the data feed segment by segment on the fly.
The experimental results show that the suite of proposed algorithms attains better and more stable performance from the classification model, with much greater credibility than five other traditional methods, when imbalanced datasets are used as training data for inducing a classifier.
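The two-step selection used by the earlier algorithms (filter by the Kappa criterion, then take the highest accuracy) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Confusion` type, the function names, the candidate confusion matrices, and the threshold `kappa_min=0.4` are all assumptions made for illustration, since the abstract does not state a Kappa threshold. Kappa here is Cohen's kappa computed from a binary confusion matrix.

```python
from typing import List, NamedTuple, Optional


class Confusion(NamedTuple):
    """Binary confusion matrix; the minority class is treated as positive."""
    tp: int
    fn: int
    fp: int
    tn: int


def accuracy(c: Confusion) -> float:
    total = c.tp + c.fn + c.fp + c.tn
    return (c.tp + c.tn) / total


def cohen_kappa(c: Confusion) -> float:
    total = c.tp + c.fn + c.fp + c.tn
    p_observed = (c.tp + c.tn) / total
    # Agreement expected by chance, from the actual and predicted marginals.
    p_pos = ((c.tp + c.fn) / total) * ((c.tp + c.fp) / total)
    p_neg = ((c.fp + c.tn) / total) * ((c.fn + c.tn) / total)
    p_chance = p_pos + p_neg
    if p_chance == 1.0:
        return 0.0  # degenerate case: only one class present and predicted
    return (p_observed - p_chance) / (1.0 - p_chance)


def select_two_step(candidates: List[Confusion],
                    kappa_min: float = 0.4) -> Optional[Confusion]:
    """Step 1: keep candidates meeting the (assumed) Kappa threshold.
    Step 2: among those, return the one with the highest accuracy."""
    credible = [c for c in candidates if cohen_kappa(c) >= kappa_min]
    if not credible:
        return None
    return max(credible, key=accuracy)


# Hypothetical results for three rebalancing candidates on 1000 test samples,
# of which 10 belong to the minority class.
candidates = [
    Confusion(tp=0, fn=10, fp=0, tn=990),   # predicts majority class only
    Confusion(tp=8, fn=2, fp=10, tn=980),
    Confusion(tp=9, fn=1, fp=30, tn=960),
]
best = select_two_step(candidates)
```

In this made-up example, the first candidate has the highest accuracy (0.99) but a kappa of zero, because it never predicts the minority class; the filter in step 1 discards it, which is the sense in which accuracy alone is not a credible indicator on extremely imbalanced data.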

Journal: Applied Soft Computing
Publication Date: Nov 23, 2017
Citations: 6