Abstract

Outlier detection is crucial for improving the performance of machine learning algorithms, particularly on data sets with a small number of points. Existing outlier detection methods deliver good results on some data sets but perform considerably worse on others. There is also a need for an algorithm that processes high-dimensional data sets quickly. To satisfy these requirements, we propose an unsupervised local outlier detection method that draws the neighborhood boundaries of the data points via Chebyshev's inequality. The proposed method sets the boundaries of the points through a so-called deviation parameter, which is related to the standard deviation of the data distribution, and then detects outliers by quantifying their neighborhood densities. Experimental results on real-world and synthetic data sets show the efficacy of the proposed method in comparison with state-of-the-art methods. The source code of the proposed algorithm and the data sets are available at https://github.com/fatihaydin1/BLDOD.
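
The general idea of a Chebyshev-bounded neighborhood can be illustrated with a minimal sketch. This is not the BLDOD algorithm from the repository above; it is one possible reading of the abstract, under stated assumptions. The function name `chebyshev_radius_scores`, the parameter `k` (standing in for the deviation parameter), and the choice of nearest-neighbor distances as the underlying distribution are all hypothetical. Chebyshev's inequality guarantees that at most a fraction 1/k^2 of values lie k or more standard deviations from the mean, so a radius of mean + k * std leaves at most that fraction of points with an empty neighborhood.

```python
import numpy as np

def chebyshev_radius_scores(X, k=2.0):
    """Illustrative only: count neighbors inside a Chebyshev-style radius.

    k is a hypothetical stand-in for the deviation parameter.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances (O(n^2) memory; fine for small examples).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor

    # Distribution of nearest-neighbor distances over all points.
    nn = dist.min(axis=1)
    # Assumed boundary: mean plus k standard deviations of that distribution.
    radius = nn.mean() + k * nn.std()

    # Neighborhood density: number of points inside the boundary.
    counts = (dist <= radius).sum(axis=1)
    return radius, counts

# Synthetic data: one Gaussian cluster plus a single distant point.
rng = np.random.default_rng(0)
cluster = rng.normal(0.0, 1.0, size=(50, 2))
far_point = np.array([[25.0, 25.0]])
X = np.vstack([cluster, far_point])

radius, counts = chebyshev_radius_scores(X, k=2.0)
# Points with the lowest counts are the outlier candidates in this sketch.
outlier_candidates = np.argsort(counts)[:3]
```

In this sketch a larger `k` widens every neighborhood and flags fewer points, while a smaller `k` does the opposite; how the published method actually sets and uses its deviation parameter should be taken from the paper and the linked source code.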
