Abstract
In this paper, we consider the problem of semi-supervised binary classification by Support Vector Machines (SVM). This problem is explored as an unconstrained, non-smooth optimization task in which part of the available data is unlabelled. We apply non-smooth optimization techniques to classification problems whose objective function is non-convex and non-differentiable, and therefore difficult to minimize. We explore and compare the properties of stochastic approximation algorithms (Simultaneous Perturbation Stochastic Approximation (SPSA) with the Lipschitz perturbation operator, SPSA with the uniform perturbation operator, and the standard finite difference approximation) for semi-supervised SVM classification. We present numerical results obtained by running the proposed methods on several standard test problems drawn from the binary classification literature.
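To make the stochastic approximation idea concrete, the following is a minimal sketch of the basic SPSA scheme: each iteration estimates the gradient from only two evaluations of the objective, regardless of dimension, using a random ±1 perturbation. This sketch uses the classical Bernoulli perturbation, not the Lipschitz or uniform perturbation operators studied in the paper, and the function name and gain constants are illustrative assumptions.

```python
import numpy as np

def spsa_minimize(f, theta0, n_iter=500, a=0.1, c=0.1,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimise f by basic SPSA (illustrative sketch, not the paper's variants)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for k in range(n_iter):
        ak = a / (k + 1) ** alpha       # decaying step-size gain
        ck = c / (k + 1) ** gamma       # decaying perturbation gain
        # Simultaneous random perturbation of every coordinate at once.
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        # Two-sided gradient estimate from just two function evaluations.
        g_hat = (f(theta + ck * delta) - f(theta - ck * delta)) / (2.0 * ck * delta)
        theta = theta - ak * g_hat
    return theta
```

For a smooth test objective such as f(x) = ‖x‖², the iterates drift toward the minimiser at the origin; for the non-convex, non-differentiable SVM objectives considered in the paper, the perturbation operator must be chosen more carefully, which is exactly what the compared variants address.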
Highlights
Support Vector Machines (SVMs) are well-known data mining methods for classification, regression and time series analysis problems
When the data points consist of exactly two sets, one labelled by a decision maker and the other not classified but known to belong to one known category, we have a traditional semi-supervised classification problem
Several standard examples drawn from the binary classification literature were chosen
Summary
Support Vector Machines (SVMs) are well-known data mining methods for classification, regression and time series analysis problems. In the standard binary classification problem, a set of training data (u₁, y₁), …, (uₘ, yₘ) is analysed, where the input points are uᵢ ∈ U ⊂ ℝⁿ and each label yᵢ ∈ {+1, −1} indicates the class to which the point uᵢ belongs. The main idea of SVM classification is to find a maximal-margin separating hyperplane between the classes [4]. For a linearly separable case, the support vector algorithm looks for the separating hyperplane with the largest margin, where w is the normal vector of the separating hyperplane. The goal of classification is to maximize the margin width. We can formulate our problem as a standard quadratic programming problem [4, 5].
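The quadratic program referred to above appears to have been lost in extraction; the standard hard-margin primal formulation for linearly separable data, consistent with the notation above, is:

\[
\min_{w,\,b}\; \frac{1}{2}\,\|w\|^{2}
\quad \text{subject to} \quad
y_{i}\left(\langle w, u_{i}\rangle + b\right) \ge 1,
\qquad i = 1, \dots, m,
\]

where b is the bias of the hyperplane. Minimising ‖w‖² maximises the margin width 2/‖w‖ between the two supporting hyperplanes.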