The Need for Unsupervised Outlier Model Selection: A Review and Evaluation of Internal Evaluation Strategies

Martin Q Ma, Yue Zhao, Leman Akoglu, Xiaorong Zhang

doi:10.1145/3606274.3606277

Abstract

Given an unsupervised outlier detection task, how should one select i) a detection algorithm, and ii) associated hyperparameter values (jointly called a model)? Effective outlier model selection is essential, as different algorithms may work well for different detection tasks, and their performance can be quite sensitive to the values of the hyperparameters (HPs). On the other hand, unsupervised model selection is notoriously difficult in the absence of hold-out validation data with ground-truth labels. Therefore, the problem is vastly understudied in the outlier mining literature. There exists a body of work that proposes internal model evaluation strategies for selecting a model. These so-called internal strategies rely solely on the input data (without labels) and the output (outlier scores) of the candidate models. In this paper, we first survey internal model evaluation strategies, including both those proposed specifically for outlier detection and those that can be adapted from the unsupervised deep representation learning literature. Then, we investigate their effectiveness empirically in comparison to simple baselines such as random selection and the popular state-of-the-art detector Isolation Forest (iForest) with default HPs. To this end, we set up (and open-source) a large testbed with 39 detection tasks and 297 candidate models comprising 8 different detectors and various HP configurations. We evaluate internal strategies from 7 different families on their ability to discriminate between models w.r.t. detection performance, without using any labels. Our study reports a striking finding: none of the existing and adapted strategies would be practically useful. Stand-alone ones are not significantly different from random, and consensus-based ones do not outperform iForest (with default HPs) while being more expensive (as all candidate models need to be trained for evaluation). Our survey stresses the importance of and the standing need for effective unsupervised outlier model selection, and acts as a call for future work on the problem.
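To make the idea of a consensus-based internal strategy concrete, the following is a minimal sketch under stated assumptions, not the paper's testbed or any of its evaluated strategies. The candidate models are hypothetical: a k-th-nearest-neighbour distance detector with different values of k. The selection rule (pick the candidate whose score ranking agrees most with the rank-averaged consensus) is one plausible illustration of using only the input data and the outlier scores, without labels. The function names and the toy data are assumptions introduced here.

```python
import math
import random


def knn_scores(points, k):
    """Outlier score = distance to the k-th nearest neighbour (larger = more outlying)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def ranks(values):
    """Rank positions (0 = smallest). Ties are broken by index, which is fine for a sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for pos, i in enumerate(order):
        r[i] = pos
    return r


def pearson(a, b):
    """Pearson correlation; applied to ranks it gives a Spearman-style agreement."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def select_by_consensus(points, candidate_ks):
    """Hypothetical consensus rule: every candidate must be trained and scored first."""
    rank_lists = {k: ranks(knn_scores(points, k)) for k in candidate_ks}
    n = len(points)
    # Consensus = average rank of each point across all candidate models.
    consensus = [sum(r[i] for r in rank_lists.values()) / len(rank_lists) for i in range(n)]
    # Agreement of each candidate's ranking with the consensus (no labels involved).
    agreement = {k: pearson(r, consensus) for k, r in rank_lists.items()}
    best = max(agreement, key=agreement.get)
    return best, agreement


# Toy data (an assumption for illustration): a Gaussian cluster plus three far-away points.
rng = random.Random(0)
inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
outliers = [(8.0, 8.0), (-9.0, 7.5), (10.0, -8.0)]
data = inliers + outliers

best_k, agreement = select_by_consensus(data, candidate_ks=[1, 3, 5, 10])
print("selected k:", best_k)
print("agreement with consensus:", agreement)
```

Note that the sketch has to score every candidate before it can select one, which mirrors the cost the abstract attributes to consensus-based strategies. High agreement with the consensus does not imply high detection performance; the abstract reports that such strategies do not outperform iForest with default HPs.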

Journal: ACM SIGKDD Explorations Newsletter
Publication Date: Jun 22, 2023
Citations: 7

Similar Papers

Evaluating outlier probabilities: assessing sharpness, refinement, and calibration using stratified and weighted measures
Philipp Röchner ... Henrique O Marques
Data Mining and Knowledge Discovery | VOL. 38
19 Jul 2024

Robust outlier detection based on the changing rate of directed density ratio
Kangsheng Li ... Zijian Huang
Expert Systems with Applications | VOL. 207
30 Jun 2022

Enhancing Unsupervised Outlier Model Selection: A Study on IREOS Algorithms
Philipp Schlieper ... Hermann Luft
ACM Transactions on Knowledge Discovery from Data | VOL. 18
19 Jun 2024

Intercity Rail Transit Platform Anomaly Detection Using Door Tracking-Based Key Frame Extraction and AnoDet Network
Ruikang Liu ... Mengfei Duan
IEEE Transactions on Instrumentation and Measurement | VOL. 72
01 Jan 2023
