Abstract

This paper explores a new ensemble approach to novelty detection called Ensemble Probability Distribution Novelty Detection (EPDND). The proposed approach provides a metric to characterize different classes. Experimental results on 4 real-world datasets show that, in many cases, EPDND performs competitively with two common novelty detection approaches, Support Vector Domain Description (SVDD) and Gaussian Mixture Models (GMM), in terms of accuracy, recall, and F1 score.

Highlights

  • One of the basic assumptions in most supervised machine learning algorithms is that the class label set is predefined and shared by the training and testing sets, so that the classification model can generalize well

  • The tables show that EPDND often achieves the best results; in some cases it is on par with the Gaussian mixture model (GMM), while Support vector data description (SVDD) rarely achieves the best results

  • Ensemble Probability Distribution Novelty Detection (EPDND) has a visible advantage over GMM and SVDD in terms of F1 score when Class1 is used as the new class, though it is not always the best detector


Summary

Introduction

One of the basic assumptions in most supervised machine learning algorithms is that the class label set is predefined and shared by the training and testing sets, so that the classification model can generalize well. For example, in online webpage classification we can list some common classes, such as entertainment, politics, and sports. Classification performance degrades sharply when new classes that were never defined in the training phase emerge in the testing phase. Novelty detection is defined as the task of recognising that test data differ in some respects from the data used in the training stage [1][2]. We propose an efficient ensemble framework for novelty detection and present a specialization of the framework involving 5 individual classifiers.
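
To make the idea of ensemble-based novelty detection concrete, the following is a minimal sketch of a generic scheme, not the EPDND algorithm itself, whose details are not given in this summary. It assumes each base classifier returns a probability distribution over the known classes, averages those distributions, and flags a sample as novel when no known class is predicted with enough confidence. All names (`make_centroid_classifier`, `predict_with_novelty`, the centroids, and the 0.7 threshold) are hypothetical choices for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: a classifier maps a sample to a probability
# distribution over the known class labels.
Classifier = Callable[[Sequence[float]], Dict[str, float]]

NOVEL = "novel"


def make_centroid_classifier(
    centroids: Dict[str, Sequence[float]], scale: float
) -> Classifier:
    """Toy base classifier (an assumption, not from the paper): class scores
    decay with squared distance to each class centroid, then are normalized."""

    def classify(sample: Sequence[float]) -> Dict[str, float]:
        scores = {
            label: math.exp(-scale * sum((a - b) ** 2 for a, b in zip(sample, c)))
            for label, c in centroids.items()
        }
        total = sum(scores.values())
        if total == 0.0:  # all scores underflowed: fall back to uniform
            return {label: 1.0 / len(scores) for label in scores}
        return {label: s / total for label, s in scores.items()}

    return classify


def average_distribution(
    classifiers: List[Classifier], sample: Sequence[float]
) -> Dict[str, float]:
    """Average the class probability distributions of all ensemble members."""
    totals: Dict[str, float] = {}
    for clf in classifiers:
        for label, p in clf(sample).items():
            totals[label] = totals.get(label, 0.0) + p
    return {label: t / len(classifiers) for label, t in totals.items()}


def predict_with_novelty(
    classifiers: List[Classifier], sample: Sequence[float], threshold: float = 0.7
) -> str:
    """Return the most probable known class, or NOVEL if the ensemble's
    averaged confidence in every known class is below the threshold."""
    avg = average_distribution(classifiers, sample)
    best = max(avg, key=avg.get)
    return best if avg[best] >= threshold else NOVEL


# Two known classes; the ensemble members differ only in their scale parameter.
known = {"sports": (0.0, 0.0), "politics": (4.0, 4.0)}
ensemble = [make_centroid_classifier(known, s) for s in (0.25, 0.5, 1.0)]

near_sports = predict_with_novelty(ensemble, (0.2, 0.1))
near_politics = predict_with_novelty(ensemble, (3.9, 4.2))
ambiguous = predict_with_novelty(ensemble, (2.0, 2.0))
```

In this toy setup a sample close to a known centroid receives that class label, while a sample equidistant from both centroids gets an averaged confidence of 0.5 per class and is flagged as novel. A fixed confidence threshold is only one possible decision rule; the paper's metric for characterizing classes may differ.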

Related work
Confidence distribution
Ensemble probability distribution novelty detection approach
Preliminaries
Experiments
The result analysis
Conclusions