Abstract

Support vector machines (SVMs) are powerful tools for solving classification problems and have been applied to many fields, such as pattern recognition and data mining, over the past decade. Weighted support vector machines (weighted SVMs) extend SVMs by recognizing that different input vectors make different contributions to the learning of the decision surface. An important issue in training weighted SVMs is how to develop a reliable weighting model that reflects the true noise distribution in the training data, i.e., one in which noise and outliers receive low weights. In this paper, we propose to use emerging patterns (EPs) to construct such a model. EPs are itemsets whose supports in one class are significantly higher than their supports in the other class. Since EPs of a given class represent the discriminating knowledge unique to their home class, noise and outliers should contain no EPs, or EPs of both contradicting classes, while a representative instance of a class should contain strong EPs of that same class. We calculate a numeric score for each instance based on EPs, and then assign weights to the training data using those scores. Extensive experiments carried out on a large number of benchmark datasets show that our weighting scheme often improves the performance of weighted SVMs over standard SVMs. We argue that the improvement is due to the ability of our model to approximate the true distribution of the data points.
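The pipeline outlined above (score each training instance by the emerging patterns it contains, map the scores to per-instance weights, then train a weighted SVM) can be illustrated with a minimal sketch. The sketch below assumes scikit-learn's SVC, whose sample_weight argument provides per-instance weighting; the EP mining step is outside the abstract's scope, and the ep_score and ep_weights functions are hypothetical placeholders rather than the paper's actual formulas.

    # Minimal sketch of EP-based instance weighting for a weighted SVM.
    # Assumptions: labels y are 0/1; each row also has a discretised
    # "transaction" view (a set of items) used to match mined EPs.
    import numpy as np
    from sklearn.svm import SVC

    def ep_score(items, eps_same_class, eps_other_class):
        # Hypothetical score: EPs of the instance's own class that it contains,
        # minus EPs of the contradicting class (higher = more representative).
        same = sum(1 for ep in eps_same_class if ep.issubset(items))
        other = sum(1 for ep in eps_other_class if ep.issubset(items))
        return same - other

    def ep_weights(scores):
        # Map raw scores to positive weights in (0, 1]; instances with low
        # scores (likely noise or outliers) receive small weights.
        scores = np.asarray(scores, dtype=float)
        shifted = scores - scores.min()
        return (shifted + 1e-3) / (shifted.max() + 1e-3)

    def train_weighted_svm(X, y, itemsets, eps_by_class):
        # eps_by_class[c] holds the mined EPs (as sets of items) for class c.
        scores = [
            ep_score(items, eps_by_class[label], eps_by_class[1 - label])
            for items, label in zip(itemsets, y)
        ]
        weights = ep_weights(scores)
        clf = SVC(kernel="rbf", C=1.0)
        # sample_weight realises the per-instance weighting of weighted SVMs.
        clf.fit(X, y, sample_weight=weights)
        return clf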
