Abstract

The cost-sensitive support vector machine (CSVM) is one of the most popular tools for class-imbalanced problems such as fault diagnosis. However, such data often come with huge numbers of examples as well as features. Aiming at class-imbalanced problems on big data, this paper proposes a cost-sensitive support vector machine using a randomized dual coordinate descent method (CSVM-RDCD). The solution of the subproblem at each iteration is derived in closed form, and the computational cost is reduced through an accelerating strategy and cheap per-iteration computation. The four constrained conditions of CSVM-RDCD are derived. Experimental results show that the proposed method increases the recognition rate of the positive class and reduces the average misclassification cost on real, large class-imbalanced data.

Highlights

  • The most popular strategy for the design of classification algorithms is to minimize the probability of error, assuming that all misclassifications have the same cost and the classes of the dataset are balanced [1,2,3,4,5,6]

  • Experiments on large-scale data sets show that the cost-sensitive support vector machine using the randomized dual coordinate descent method runs more efficiently than both the parallel cost-sensitive support vector machine (PCSVM) and CSSVM; in particular, the randomized dual coordinate descent algorithm has a training-time advantage on large-scale data sets

  • The randomized dual coordinate descent method (RDCD) is the optimization algorithm that updates the global solution by solving each subproblem analytically in closed form


Summary

Introduction

The most popular strategy for the design of classification algorithms is to minimize the probability of error, assuming that all misclassifications have the same cost and the classes of the dataset are balanced [1,2,3,4,5,6]. The cost-sensitive support vector machine (CSVM) [2] is one of the most popular tools for class-imbalanced problems and problems with unequal misclassification costs. CSVM usually maps training vectors into a high-dimensional space via a nonlinear function. When the data already lie in a rich, high-dimensional feature space, performance is similar with or without the nonlinear mapping. Dual coordinate descent methods for the dual problem of CSVM are among the popular approaches to large-scale convex optimization, but existing work does not focus on big-data learning for CSVM.
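The dual coordinate descent idea referenced above can be sketched for a linear L1-loss cost-sensitive SVM: the dual is min_a 0.5*aᵀQa − eᵀa subject to 0 ≤ a_i ≤ C_{y_i}, each single-variable subproblem has a closed-form clipped-Newton solution, and the weight vector is maintained incrementally so each update is cheap. This is a minimal illustrative sketch, not the paper's exact CSVM-RDCD algorithm; the function name `csvm_rdcd` and the per-class costs `C_pos`/`C_neg` are assumptions for the example.

```python
import numpy as np

def csvm_rdcd(X, y, C_pos=1.0, C_neg=1.0, epochs=20, seed=0):
    """Randomized dual coordinate descent for a linear L1-loss
    cost-sensitive SVM dual:
        min_a  0.5 * a'Qa - e'a,   s.t. 0 <= a_i <= C_{y_i},
    where Q_ij = y_i y_j x_i'x_j and C_{y_i} is the class-dependent cost.
    (Illustrative sketch, not the paper's exact CSVM-RDCD.)"""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)                       # maintained as w = sum_i alpha_i y_i x_i
    Qii = np.einsum('ij,ij->i', X, X)     # diagonal of Q (squared row norms)
    C = np.where(y > 0, C_pos, C_neg)     # per-example upper bound from class cost
    for _ in range(epochs):
        for i in rng.permutation(n):      # randomized coordinate order
            if Qii[i] <= 0:
                continue
            G = y[i] * (w @ X[i]) - 1.0                          # partial gradient
            new_ai = np.clip(alpha[i] - G / Qii[i], 0.0, C[i])   # closed-form subproblem
            delta = new_ai - alpha[i]
            if delta != 0.0:
                w += delta * y[i] * X[i]  # cheap incremental update of w
                alpha[i] = new_ai
    return w, alpha
```

Keeping `w` up to date after each coordinate step is what makes the per-iteration cost O(d) rather than O(nd), which is the key to scaling this family of methods to large data sets.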

Basic Theory of Cost-Sensitive Support Vector Machine
The Modified Proposed Method
Description of Cost-Sensitive Support Vector Machine
Experiments and Analysis
Conclusions