Abstract

Explaining outliers is a topic that attracts a lot of interest; however existing proposals focus on the identification of the relevant dimensions. We extend this rationale for unsupervised distance-based outlier detection, and through investigating subspaces, we propose a novel labeling of outliers in a manner that is intuitive for the user and does not require any training at runtime. Moreover, our solution is applicable to online settings and a complete prototype for detecting and explaining outliers in data streams using massive parallelism has been implemented. Our solution is evaluated in terms of both the quality of the labels derived and the performance.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.