Imbalanced Data SVM Classification Method Based on Cluster Boundary Sampling and DT-KNN Pruning

Peng Li, Xiao-Yang Yu, Ting-Ting Bi, Jiu-Ling Huang

doi:10.14257/ijsip.2014.7.2.06

Abstract

This paper presents an SVM classification method for imbalanced data sets based on cluster boundary sampling and sample pruning. We address the problem of imbalanced data set classification from two directions: data re-sampling and algorithm improvement. First, we propose cluster boundary sampling, which uses a clustering density threshold and a boundary density threshold to determine cluster boundaries and thereby guide the re-sampling process more accurately. Second, we put forward a new sample pruning algorithm based on dynamic threshold KNN (DT-KNN) to deal with data complexity and class overlap in imbalanced data sets, both of which reduce the classification performance and generalization ability of an SVM classifier. Experiments show that our method clearly improves classification performance on a variety of imbalanced data sets, which demonstrates its validity and stability.
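The abstract names the two ideas but does not give their formulas, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes that density means the number of neighbours within a fixed radius, that a boundary point is one whose density lies between the two thresholds, and that the "dynamic" KNN threshold is scaled by the minority-to-majority ratio. The names `boundary_indices`, `prune_majority`, `radius`, `cluster_thr`, `boundary_thr` and `base` are all hypothetical.

```python
import math


def boundary_indices(points, radius, cluster_thr, boundary_thr):
    """Indices of points whose neighbour count within `radius` lies in
    [boundary_thr, cluster_thr). Assumed definition of a cluster boundary."""
    out = []
    for i, p in enumerate(points):
        density = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if boundary_thr <= density < cluster_thr:
            out.append(i)
    return out


def prune_majority(points, labels, majority, k=3, base=0.5):
    """Return indices to keep after dropping majority samples whose k nearest
    neighbours are mostly from other classes.

    Assumption: the threshold moves from `base` (heavy imbalance) towards 1.0
    (balanced data), so stronger imbalance prunes overlapping samples sooner.
    """
    n_major = sum(1 for y in labels if y == majority)
    n_minor = len(labels) - n_major
    ratio = min(1.0, n_minor / n_major)
    threshold = base + (1.0 - base) * ratio

    keep = []
    for i, p in enumerate(points):
        if labels[i] != majority:
            keep.append(i)  # minority samples are never pruned here
            continue
        # k nearest neighbours of point i by Euclidean distance, excluding i
        nearest = sorted(
            (math.dist(p, points[j]), j)
            for j in range(len(points)) if j != i
        )[:k]
        other = sum(1 for _, j in nearest if labels[j] != majority)
        if other / k < threshold:
            keep.append(i)
    return keep


# Toy data: five majority points near the origin, one majority point that
# sits inside the minority cluster, and three minority points.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5),
       (5, 4.9), (5.1, 5), (4.9, 5.1)]
lab = [0, 0, 0, 0, 0, 0, 1, 1, 1]
kept = prune_majority(pts, lab, majority=0, k=3)
print(kept)
```

In this toy run only the majority sample at (5, 5), surrounded by minority points, is dropped. The kept samples could then be passed to any SVM implementation for training.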


Journal: International Journal of Signal Processing, Image Processing and Pattern Recognition
Publication Date: Apr 30, 2014
Citations: 13

Similar Papers

Imbalanced Data Sample Pruning Algorithm Based on Dynamic threshold K Nearest Neighbor
Li Peng ... Ting-Ting Bi
15 Dec 2013

Clustering Based Data Preprocessing Technique to Deal with Imbalanced Dataset Problem in Classification Task
Anil Jadhav
01 Nov 2018

An Overview on Data Mining Designed for Imbalanced Datasets
Mohammad Imran
International Journal of Research in Engineering and Technology | VOL. 03
25 Oct 2014

Emerging Trends in Classification with Imbalanced Datasets: A Bibliometric Analysis of Progression
Abdullah Maraş ... Çiğdem Erol
Bilişim Teknolojileri Dergisi | VOL. 15
31 Jul 2022
