Rare Category Analysis for Complex Data: A Review

Dawei Zhou, Jingrui He

doi:10.1145/3626520

Abstract

Though the sheer volume of collected data is immense, it is the rare categories that are often the most important in many high-impact domains, ranging from financial fraud detection in online transaction networks to emerging trend detection in social networks, and from spam image detection on social media platforms to rare disease diagnosis in medical decision support systems. The unique challenges of rare category analysis include (1) the highly skewed class distribution; (2) the non-separable nature of the rare categories from the majority classes; (3) data and task heterogeneity; and (4) the time-evolving property of the input data sources. This survey reviews state-of-the-art techniques used in complex rare category analysis, where the majority classes have a smooth distribution while the minority classes exhibit the compactness property in the feature space or subspace. Rare category analysis aims to identify, characterize, represent, and interpret anomalies that not only show statistical significance but also exhibit interesting patterns (e.g., compactness, high-order structures, appearing in a burst). We introduce our study, define the problem setting, and describe the unique challenges of complex rare category analysis. We then present a comprehensive review of recent advances designed for this problem setting, from rare category exploration without any label information to rare category exposition that characterizes rare examples with a compact representation, and from the representation of rare patterns in a salient embedding space to the interpretation of the prediction results, which provides relevant clues for the end-users' interpretation. Finally, we discuss potential challenges and shed light on future directions for complex rare category analysis.

More From: ACM Computing Surveys

Similar Papers

Gold Panning from the Mess
Dawei Zhou ... Jingrui He
25 Jul 2019

SPARC
Dawei Zhou ... Wei Fan
19 Jul 2018

Investigating the Community Detection Algorithm Using Computational Intelligence Base Method
K Velkumar ... P Thendral
04 Aug 2021

Folksonomy-based ad hoc community detection in online social networks
Vasanth Nair ... Sumeet Dua
Social Network Analysis and Mining | VOL. 2
23 Aug 2012
