Abstract

Outlier detection is a technique for recognizing samples that lie outside the main population of a data set. Outliers have a negative impact on classification, so the recognized outliers are generally deleted to improve classification power. This paper proposes a method for outlier detection in test samples alongside supervised training set selection. Training set selection is based on the intersection of three well-known similarity measures, namely Jaccard, cosine, and Dice. Each test sample is evaluated against the selected training set for possible outlier detection. The selected training set is used for a two-stage classification. The accuracy of the classifiers increases after outlier deletion. A majority voting function is used to further improve the classifiers.
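
As a rough illustration of the ideas named above, the following minimal sketch computes the Jaccard, cosine, and Dice similarities on binary feature sets, keeps only the training samples that pass a threshold under all three measures (their intersection), flags a test sample as an outlier when it resembles no selected sample, and combines classifier outputs by majority vote. The abstract does not specify these details, so the set-based sample representation, the `reference` prototype, the `threshold` value, and all function names are assumptions made for illustration only and are not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def jaccard(a, b):
    """Jaccard similarity of two feature sets: |A n B| / |A u B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cosine(a, b):
    """Cosine similarity for binary feature sets: |A n B| / sqrt(|A| * |B|)."""
    denom = sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


def dice(a, b):
    """Dice similarity of two feature sets: 2 * |A n B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


MEASURES = (jaccard, cosine, dice)


def select_training_set(train, reference, threshold=0.5):
    """Return indices of samples that pass the threshold under ALL measures.

    `reference` is a hypothetical class prototype; the paper may use a
    different selection criterion.
    """
    kept = set(range(len(train)))
    for measure in MEASURES:
        passed = {i for i, s in enumerate(train) if measure(s, reference) >= threshold}
        kept &= passed  # intersection across the three measures
    return sorted(kept)


def is_outlier(sample, selected, threshold=0.5):
    """Assumed rule: a test sample is an outlier if no selected training
    sample is similar to it under all three measures."""
    return not any(
        all(m(sample, t) >= threshold for m in MEASURES) for t in selected
    )


def majority_vote(labels):
    """Return the label predicted by the most classifiers."""
    return Counter(labels).most_common(1)[0][0]
```

For example, with `train = [{1, 2, 3}, {1, 2}, {7, 8, 9}]` and `reference = {1, 2, 3}`, a threshold of 0.7 keeps only the first sample, because the second passes the cosine and Dice thresholds but not the Jaccard one. This shows how taking the intersection is stricter than relying on any single measure.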
