Abstract

This paper proposes a new algorithm, the OKC classifier, which is a hybrid of the One-Class SVM, k-Nearest Neighbours and CART algorithms. The performance of most classification algorithms is significantly influenced by characteristics of the datasets on which they are trained, such as imbalance in class distribution, class overlap and lack of density. The proposed algorithm can perform the classification task on imbalanced datasets without re-sampling. It is compared against several well-known classification algorithms on datasets with varying degrees of class imbalance and class overlap. The experimental results show that the proposed algorithm performs better than a number of standard classification algorithms.

Highlights

  • Classification is the task of assigning each instance to one class from a given set of classes

  • The results of the proposed algorithm are compared with those of standard machine learning algorithms: decision tree, neural network, Support Vector Machine (SVM), Naïve Bayes, k-Nearest Neighbors, Naïve Bayes tree, and Classification and Regression Tree (CART)

  • A new classification algorithm based on a hybrid combination of the One-Class SVM, k-NN and CART algorithms is proposed

Summary

INTRODUCTION

Classification is the task of assigning each instance to one class from a given set of classes. This task is done by a classifier that is trained on a dataset of training cases. In many real-world domains, such as fraud detection and medical diagnosis, the number of examples that belong to one class may severely outnumber the instances that belong to the other class or classes. Such datasets, in which the proportions of cases belonging to the various classes differ significantly, are called imbalanced datasets. Because the majority class instances far outnumber the minority class ones, a classifier can achieve high accuracy even if it classifies all instances as the majority class and misclassifies every minority class instance. We propose a new algorithm, the OKC classifier (a hybrid of One-Class SVM, k-Nearest Neighbors and CART), to overcome this problem.
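The accuracy problem described above can be illustrated with a small sketch. The dataset below is hypothetical (990 majority instances labelled 0 and 10 minority instances labelled 1, figures chosen only for illustration), and the helper functions `accuracy` and `recall` are our own names rather than anything defined in the paper. The "classifier" simply predicts the majority class for every instance.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of instances of class `positive` that were predicted correctly."""
    preds_for_positive = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positive) / len(preds_for_positive)


# Hypothetical imbalanced dataset: 990 majority (0) and 10 minority (1) instances.
y_true = [0] * 990 + [1] * 10

# A trivial classifier that labels every instance as the majority class.
y_pred = [0] * len(y_true)

majority_only_accuracy = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, positive=1)

print(f"accuracy:        {majority_only_accuracy:.2f}")
print(f"minority recall: {minority_recall:.2f}")
```

Under these assumptions the trivial classifier reaches 99% accuracy while identifying none of the minority instances, which is why accuracy alone can be misleading on imbalanced datasets.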

Imbalanced Datasets
Class Overlapping
Lack of Density
BACKGROUND
Proposed OKC Classifier
One Class SVM
K-Nearest Neighbors
Splitting Criteria for OKC Classifier
Stopping Conditions for OKC Classifier
Algorithm
Experimental Results
CONCLUSION