Classification with Local Clustering in Imbalanced Data Sets

Hua Ji, Hua Xiang Zhang

doi:10.4028/www.scientific.net/amr.219-220.151

Abstract

Learning from imbalanced data sets is a common problem in many real-world domains. A skewed class distribution challenges traditional classifiers, which achieve much lower classification accuracy on rare classes. To address this problem, we propose a novel classification method that applies local clustering based on the data distribution of the imbalanced data set. First, we divide the whole data set into several data groups based on the data distribution. Then, within each group, we perform local clustering on both the normal class and the disjoint rare class. The rare class is subsequently over-sampled at different rates. Finally, we apply support vector machines (SVMs) for classification, using the traditional cost-matrix technique to improve classification accuracy. Experimental results on several UCI data sets show that this method produces much higher prediction accuracy on the rare class than state-of-the-art methods.
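The abstract gives no algorithmic detail, so the following is only a minimal sketch of the rare-class part of such a pipeline, not the authors' implementation. It assumes plain k-means for the local clustering, assumes each rare-class cluster is grown to roughly the same size (so smaller clusters are over-sampled at a higher rate), assumes synthetic points are interpolated between two members of the same cluster, and assumes misclassification costs are set inversely proportional to class frequency. The function names `kmeans`, `oversample_by_cluster`, and `class_costs` are hypothetical. The initial division into data groups and the SVM training step are left out.

```python
import random
from collections import Counter


def dist2(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def oversample_by_cluster(rare, k, target_total, seed=0):
    """Over-sample the rare class cluster by cluster.

    Assumption (not from the paper): every non-empty cluster is grown to about
    target_total / (number of clusters) points, so small clusters receive a
    higher over-sampling rate than large ones.
    """
    rng = random.Random(seed)
    labels = kmeans(rare, k, seed=seed)
    clusters = [[p for p, lab in zip(rare, labels) if lab == c] for c in range(k)]
    clusters = [members for members in clusters if members]
    per_cluster = target_total // len(clusters)
    out = list(rare)  # original rare samples come first, synthetic ones follow
    for members in clusters:
        for _ in range(max(0, per_cluster - len(members))):
            # Assumption: interpolate between two random members of the same cluster.
            a, b = rng.choice(members), rng.choice(members)
            t = rng.random()
            out.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return out


def class_costs(labels):
    """Assumed cost heuristic: cost inversely proportional to class frequency."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: n / (len(counts) * m) for cls, m in counts.items()}
```

The augmented rare-class samples and the per-class costs could then be passed to any SVM implementation that supports class-dependent penalties, for example through the `class_weight` parameter of scikit-learn's `SVC`; that choice is also an assumption rather than something stated in the abstract.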

Journal: Advanced Materials Research
Publication Date: Mar 1, 2011
Citations: 2

Similar Papers

Comparing the classification performances of supervised classifiers with balanced and imbalanced SAR data sets
Mustafa Üstüner ... Ünsal Gökdağ
01 May 2018

Hybrid SVM Model for the Classification of Imbalanced Data Sets
Jae Sik Lee ... Jong Gu Kwon
Journal of Intelligence and Information Systems | VOL. 19
30 Jun 2013

Improving SVM classification on imbalanced time series data sets with ghost points
Suzan Köknar-Tezel ... Longin Jan Latecki
Knowledge and Information Systems | VOL. 28
16 Jun 2010

Local decomposition for rare class analysis
Junjie Wu ... Peng Wu
12 Aug 2007
