Abstract

Explaining outliers attracts considerable interest; however, existing proposals focus on identifying the relevant dimensions. We extend this rationale to unsupervised distance-based outlier detection and, by investigating subspaces, propose a novel labeling of outliers that is intuitive for the user and requires no training at runtime. Moreover, our solution is applicable to online settings, and we have implemented a complete prototype that detects and explains outliers in data streams using massive parallelism. We evaluate our solution in terms of both the quality of the derived labels and performance.

