Abstract

Instance reduction techniques are data preprocessing methods originally developed to enhance the nearest neighbor rule for standard classification. They reduce the training data by selecting or generating representative examples of a given problem. These algorithms have been designed and widely analyzed for multi-class problems, providing very competitive results. However, this issue has rarely been addressed in the context of one-class classification. In this specific domain, a reduction of the training set may not only decrease the classification time and the classifier’s complexity, but also allow us to handle internal noisy data and simplify the data description boundary. We propose two methods for achieving this goal. The first one is a flexible framework that adapts any instance reduction method to the one-class scenario by introducing meaningful artificial outliers. The second one is a novel modification of an evolutionary instance reduction technique that is based on differential evolution and uses a consistency measure for model evaluation in filter or wrapper modes. It is a powerful native one-class solution that does not require access to counterexamples. Both of the proposed algorithms can be applied to any type of one-class classifier. On the basis of extensive computational experiments, we show that the proposed methods are highly efficient techniques for reducing complexity and improving classification performance in one-class scenarios.

Highlights

  • Data preprocessing is an essential step within the machine learning process [41,21,42]

  • To adapt the Scale Factor Local Search in Differential Evolution (SFLSDE) algorithm to the nature of one-class classification (OCC), we propose to augment it with an optimization criterion based on the consistency metric

  • For some datasets, the standard instance reduction (InR) techniques returned a training set too small to build a one-class support vector classifier. This is because they use the nearest neighbor approach, which imposes no lower bound on the size of the reduced training set, while methods based on support vectors require a certain minimum number of samples for processing
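The consistency-based evaluation mentioned in the highlights can be sketched as a fitness function over candidate instance subsets. The sketch below is a minimal illustration, not the paper's exact criterion: it assumes a 1-NN acceptance rule, derives the acceptance threshold from the mean nearest-neighbour distance inside the reduced set, and weights consistency against reduction rate with a hypothetical `alpha` parameter.

```python
import numpy as np

def occ_fitness(X, mask, alpha=0.5):
    """Hypothetical fitness for an instance-selection mask in OCC.

    Combines the fraction of target samples the reduced set still
    "accepts" (a consistency proxy) with the achieved reduction rate.
    The threshold choice and the alpha weight are illustrative
    assumptions, not values taken from the paper.
    """
    S = X[mask]
    if len(S) == 0:
        return 0.0
    # Acceptance threshold: mean nearest-neighbour distance within S
    # (infinite for a singleton set, which trivially accepts everything).
    d_SS = np.linalg.norm(S[:, None] - S[None, :], axis=-1)
    np.fill_diagonal(d_SS, np.inf)
    theta = d_SS.min(axis=1).mean()
    # Distance from every target sample to its nearest prototype in S.
    d_XS = np.linalg.norm(X[:, None] - S[None, :], axis=-1).min(axis=1)
    acceptance = float((d_XS <= theta).mean())
    reduction = 1.0 - len(S) / len(X)
    return alpha * acceptance + (1 - alpha) * reduction
```

A wrapper-style evolutionary search such as SFLSDE could then maximize this score over binary selection masks, favouring small subsets that still cover the target class.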

Summary

Introduction

Data preprocessing is an essential step within the machine learning process [41,21,42]. Generating artificial counterexamples has been used so far in the process of training one-class classifiers [26], but not during the one-class preprocessing phase. This approach can be viewed as a data-level solution, as we modify our training data to allow unaltered usage of any InR algorithm from the literature. We present a family of data-level and algorithm-level InR methods for OCC and validate their usefulness and impact on training set reduction, classification accuracy and recognition time on the basis of thorough computational experiments. Such a comparison allows us to gain insight into how the size of the training set can be reduced in the absence of counterexamples, while maintaining or even improving the obtained predictive performance.

2 Related Works

This section provides the necessary background for the remainder of the paper.
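The data-level idea of generating artificial counterexamples so that any standard InR algorithm can run unchanged can be illustrated with a small sketch. Everything below is an assumption for illustration only: candidate outliers are drawn uniformly inside the target data's bounding box expanded by a `margin`, and kept only if they lie farther from every target sample than the targets' mean nearest-neighbour distance; the paper's actual generation scheme may differ.

```python
import numpy as np

def generate_artificial_outliers(X_target, n_outliers=100, margin=0.5, seed=0):
    """Sketch of data-level artificial outlier generation (assumed scheme).

    Returns n_outliers points that surround the target class, so a
    standard two-class instance reduction method can be applied to the
    combined data. Parameter names are illustrative, not from the paper.
    """
    rng = np.random.default_rng(seed)
    lo, hi = X_target.min(axis=0), X_target.max(axis=0)
    span = hi - lo
    lo_e, hi_e = lo - margin * span, hi + margin * span

    # Reference scale: mean nearest-neighbour distance among targets.
    d = np.linalg.norm(X_target[:, None] - X_target[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    theta = d.min(axis=1).mean()

    outliers = []
    while len(outliers) < n_outliers:
        cand = rng.uniform(lo_e, hi_e, size=(n_outliers, X_target.shape[1]))
        dist = np.linalg.norm(cand[:, None] - X_target[None, :],
                              axis=-1).min(axis=1)
        # Keep only candidates that do not fall inside the target cloud.
        outliers.extend(cand[dist > theta])
    return np.asarray(outliers[:n_outliers])
```

After labelling the generated points as the outlier class, any off-the-shelf two-class InR technique can be run on the union of target samples and artificial counterexamples.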

One-Class Classification
Instance Reduction in Standard Classification
The Role of Instance Reduction in One-Class Classification
Applying Instance Reduction to One-Class Classification
Adapting Existing Instance Reduction Methods to One-Class Classification
Evolutionary Filter and Wrapper Methods for One-Class Instance Reduction
Scale Factor Local Search in Differential Evolution for Instance Reduction
Adapting SFLSDE to OCC
Datasets
Methods
Set-up
General Comments on Obtained Results
Results for One-Class Nearest Neighbor
Results for Minimum Spanning Tree Data Description
Results for Support Vector Data Description
Impact on the Computational Complexity
Conclusions and Future Works