Abstract

Classifying imbalanced data is one of the most challenging problems in supervised learning and data mining. Since the early days of machine learning, a large body of research has produced many successful data balancing methods, yet none of them handles imbalanced data completely, largely because data is generated continuously and irregularly by diverse sources such as humans, machines, sensors, and robots. Existing machine learning algorithms are mostly biased toward the majority class instances and neglect the minority class instances, which severely degrades the predictive power and accuracy of the final results. The most popular approaches for dealing with imbalanced data are under-sampling and over-sampling, cost-sensitive learning, and ensemble learning. In this paper, we present a novel data balancing method that combines a clustering technique with an SVM, using the instances closest to the decision boundary to create balanced data. The proposed approach selects the most informative majority-class and minority-class support vectors nearest to the decision boundary (hyperplane) and combines them into a balanced training set. We evaluate the proposed technique with popular single classifiers, namely the C4.5 Decision Tree (DT) and Naive Bayes (NB), and with ensemble classifiers, namely Random Forest and AdaBoost, on thirteen standard real-life imbalanced datasets. The experimental results show that the proposed method yields improved accuracy on both the minority and majority class instances compared to existing methods.
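As a rough illustration of the boundary-based selection step (not the authors' implementation, which also involves a clustering stage), the sketch below fits a linear SVM with scikit-learn, keeps the support vectors of both classes, and trims the majority-class support vectors to those closest to the hyperplane so the two classes contribute comparable numbers of boundary instances. The function name boundary_balance and all parameter choices are illustrative assumptions.

    # Minimal sketch, assuming a binary imbalanced dataset held in NumPy arrays.
    import numpy as np
    from sklearn.svm import SVC

    def boundary_balance(X, y):
        """Return a reduced, more balanced training set built from SVM support vectors."""
        svm = SVC(kernel="linear", C=1.0)
        svm.fit(X, y)

        # Indices of the instances lying on or near the margin (the support vectors).
        sv_idx = svm.support_
        X_sv, y_sv = X[sv_idx], y[sv_idx]

        # Identify the minority class among the support vectors.
        classes, counts = np.unique(y_sv, return_counts=True)
        minority = classes[np.argmin(counts)]
        n_min = counts.min()

        keep = []
        for c in classes:
            idx_c = np.where(y_sv == c)[0]
            if c != minority and len(idx_c) > n_min:
                # Keep only the majority-class support vectors closest to the
                # hyperplane (smallest absolute decision-function value);
                # this assumes binary classification.
                dist = np.abs(svm.decision_function(X_sv[idx_c]))
                idx_c = idx_c[np.argsort(dist)[:n_min]]
            keep.append(idx_c)
        keep = np.concatenate(keep)
        return X_sv[keep], y_sv[keep]

In this kind of pipeline, the reduced set returned by boundary_balance would then be used to train the downstream classifiers mentioned above (DT, NB, Random Forest, AdaBoost) in place of the original imbalanced data.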
