Weighted Outlier Detection of High-Dimensional Categorical Data Using Feature Grouping

Junli Li, Ning Pang, Jifu Zhang, Xiao Qin

doi:10.1109/tsmc.2018.2847625

Abstract

We propose WATCH, a weighted outlier mining method for identifying outliers in high-dimensional categorical datasets. WATCH comprises two modules: 1) feature grouping, based on a correlation measurement among features, and 2) outlier mining, which assigns scores to objects in each feature group. At the heart of WATCH is the feature grouping module, which partitions the features into multiple groups so that different aspects of the feature patterns can be discovered in each group. The outlier mining module then uses these groups to detect outliers in high-dimensional categorical datasets. Apart from the number of outliers, which is specified by the user, WATCH requires no user-given parameter to be tuned. We implement and evaluate WATCH on synthetic and real-world datasets. Our experimental results show that WATCH is a promising and practical algorithm for detecting outliers in high-dimensional categorical datasets, achieving high performance in terms of precision, efficiency, and interpretability.
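The two-module structure described above can be illustrated with a minimal Python sketch. It is not the WATCH algorithm: the abstract does not specify the correlation measure, the grouping procedure, or the weighting scheme, so the sketch assumes normalized mutual information as the correlation measure, a hypothetical `threshold` for merging features into groups (WATCH itself is described as needing no such parameter), and group weights proportional to group size. All function names are illustrative.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(column):
    """Shannon entropy (in nats) of a sequence of categorical values."""
    n = len(column)
    return -sum((c / n) * log(c / n) for c in Counter(column).values())


def normalized_mi(x, y):
    """Mutual information of two categorical columns, scaled to [0, 1].

    Assumption: used here as a stand-in for the paper's correlation measure.
    """
    hx, hy = entropy(x), entropy(y)
    if hx == 0 or hy == 0:
        return 0.0
    hxy = entropy(list(zip(x, y)))
    return (hx + hy - hxy) / min(hx, hy)


def group_features(rows, threshold=0.5):
    """Module 1 (sketch): put correlated features into the same group.

    `threshold` is a hypothetical parameter of this sketch only.
    """
    m = len(rows[0])
    cols = [[r[j] for r in rows] for j in range(m)]
    parent = list(range(m))  # union-find forest over feature indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(m), 2):
        if normalized_mi(cols[i], cols[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for j in range(m):
        groups.setdefault(find(j), []).append(j)
    return sorted(groups.values())


def outlier_scores(rows, groups):
    """Module 2 (sketch): weighted rarity of each object within each group."""
    n, m = len(rows), len(rows[0])
    scores = [0.0] * n
    for g in groups:
        keys = [tuple(r[j] for j in g) for r in rows]
        freq = Counter(keys)
        weight = len(g) / m  # assumed weighting: share of features in group
        for i, k in enumerate(keys):
            scores[i] += weight * (1 - freq[k] / n)
    return scores


def top_k_outliers(rows, k, threshold=0.5):
    """Return the indices of the k highest-scoring objects."""
    groups = group_features(rows, threshold)
    scores = outlier_scores(rows, groups)
    return sorted(range(len(rows)), key=lambda i: -scores[i])[:k]
```

As in the abstract, the number of outliers `k` is supplied by the caller. An object scores high when its combination of values within a group is rare, which also makes the result easy to interpret: the group contributing most to the score points to the features in which the object is unusual.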


Journal: IEEE transactions on systems, man, and cybernetics. Part A, Systems and humans : a publication of the IEEE Systems, Man, and Cybernetics Society
Publication Date: Nov 1, 2020
Citations: 55
License type: publisher-specific-oa

Similar Papers

Feature grouping-based parallel outlier mining of categorical data using spark
Junli Li ... Yaling Xun
Information sciences | VOL. 504
12 Jul 2019

Mining Outliers in Correlated Subspaces for High Dimensional Data Sets
Jinsong Leng ... Tzung-Pei Hong
Fundamenta Informaticae | VOL. 98
01 Jan 2009

Outliers Detection with Correlated Subspaces for High Dimensional Datasets
Jinsong Leng ... Zhihu Huang
International journal of wavelets, multiresolution and information processing | VOL. 09
01 Mar 2011

Outlier Mining Methods Based on Graph Structure Analysis
Pablo Amil ... Cristina Masoller
Frontiers in physics | VOL. 7
26 Nov 2019
