A Novel Autoencoder Deep Architecture for Detecting the Outlier in Heterogeneous Data Sets

Satarupa Uttarkabat, Abhaya Sharma, Bidyut Kumar Patra

doi:10.1142/s2424922x22500115

Abstract

Neighborhood-based unsupervised approaches such as LDOF, LOF and the symmetric-neighborhood method INFLO have proven effective for decades. These techniques principally use information from either the k-nearest neighbors or the reverse k-nearest neighbors to compute the outlierness of each object in a data set. However, they fail to detect genuine outliers in heterogeneous data sets when the outliers lie between two dense clusters, between a dense and a sparse cluster, or in scattered data sets. In addition, LOF treats a normal point of a sparse cluster as an outlier if the sparse cluster is close to a dense cluster. This paper proposes a novel autoencoder deep learning architecture to overcome these limitations. In the proposed approach, we first identify the potential outliers in a given data set and mark them; the marked points are excluded, and the remaining points form the training samples for the autoencoder. Finally, the trained autoencoder is used to compute the outlierness of each data point in the whole data set (training samples + marked points). Experimental results on synthetic and real-world data sets show that the proposed model outperforms these widely applied techniques as well as state-of-the-art methods.
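
The three-stage pipeline described in the abstract (mark potential outliers, train on the unmarked points, score every point) can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the authors' method: the abstract does not say how potential outliers are identified, so a mean k-nearest-neighbor distance rule (the hypothetical `mark_potential_outliers`) stands in for that step, and a linear encoder/decoder pair fit in closed form via SVD (equivalent to PCA) stands in for the deep autoencoder. The outlierness score is taken to be the squared reconstruction error, which is a common choice but is not confirmed by the abstract.

```python
import numpy as np

def knn_mean_distance(X, k):
    """Mean Euclidean distance from each point to its k nearest neighbors."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    return np.sort(d, axis=1)[:, :k].mean(axis=1)

def mark_potential_outliers(X, k=5, frac=0.05):
    """Hypothetical marking rule: flag the `frac` of points with the
    largest mean kNN distance. The paper's actual rule is not given."""
    n_mark = max(1, int(round(frac * len(X))))
    return np.argsort(knn_mean_distance(X, k))[-n_mark:]

def fit_linear_autoencoder(X_train, n_hidden):
    """Stand-in for the deep autoencoder: a linear encoder W and decoder
    W.T whose optimal weights span the top principal directions."""
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:n_hidden].T  # W has shape (n_features, n_hidden)

def outlierness(X, mean, W):
    """Assumed score: squared reconstruction error of each point."""
    code = (X - mean) @ W          # encode
    recon = code @ W.T + mean      # decode
    return ((X - recon) ** 2).sum(axis=1)

# Synthetic data: 200 normal points near the plane z = 0, plus one
# planted outlier (index 200) far off that plane.
rng = np.random.default_rng(0)
normal = np.column_stack([rng.normal(size=(200, 2)),
                          0.01 * rng.normal(size=200)])
X = np.vstack([normal, [[0.0, 0.0, 3.0]]])

# Stage 1: mark potential outliers and exclude them from training.
marked = mark_potential_outliers(X, k=5, frac=0.05)
train_mask = np.ones(len(X), dtype=bool)
train_mask[marked] = False

# Stage 2: train only on the unmarked points.
mean, W = fit_linear_autoencoder(X[train_mask], n_hidden=2)

# Stage 3: score the whole data set (training samples + marked points).
scores = outlierness(X, mean, W)
print("most outlying index:", int(np.argmax(scores)))
```

Because the marked points never influence the fitted weights, a genuine outlier cannot pull the model toward itself and so tends to reconstruct poorly, while a normal point that was marked by mistake still reconstructs well and receives a low score.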
