Unsupervised Anomaly Detection Based on Deep Autoencoding and Clustering

Chuanlei Zhang, Xiaoning Yan, Wei Chen, Minda Yao, Jinyuan Shi, Nenghua Xu, Dufeng Chen, Jiangtao Liu

doi:10.1155/2021/7389943

Abstract

Unsupervised anomaly detection on high-dimensional or multidimensional data plays an important role in machine learning and in industrial applications; in network security especially, anomaly detection on network data is particularly important. The key to anomaly detection is density estimation. Although dimension reduction and density estimation methods have made great progress in recent years, most dimension reduction methods struggle to retain the key information of the original high-dimensional or multidimensional data. Recent studies have shown that the deep autoencoder (DAE) can address this problem well. To improve the performance of unsupervised anomaly detection, we propose an anomaly detection scheme based on a deep autoencoder (DAE) and clustering methods. The deep autoencoder is trained to learn a compressed representation of the input data, which is then fed to a clustering approach. The scheme uses the deep autoencoder to generate a low-dimensional representation and reconstruction errors for the high-dimensional or multidimensional input data and combines them into reconstructed low-dimensional input samples. The proposed scheme can eliminate redundant information contained in the data, improve the performance of clustering methods in identifying abnormal samples, and reduce the amount of computation. To verify the effectiveness of the proposed scheme, extensive comparison experiments were conducted against traditional dimension reduction algorithms and clustering methods. The experimental results demonstrate that, in most cases, the proposed scheme outperforms the traditional dimension reduction algorithms combined with different clustering methods.

Highlights

Anomaly detection is a very important branch of machine learning with a wide range of practical applications; it aims to detect special points in data. It is suitable for fault diagnosis [1, 2], system health monitoring [3], network security detection [4], intrusion and fraud detection [5, 6, 7], measurement, and other fields. The exceptions to the normal instances are called anomalies; they are also called exceptions, outliers, novelties, noises, and deviations [8]. Anomaly detection, then, is the task of finding objects that differ from most other objects. The three objects O1, O2, and O3 in Figure 1 differ from most of the objects in the N1 and N2 classes. What counts as a deviation differs between applications.
Some traditional dimension reduction methods, such as Linear Discriminant Analysis (LDA), the least absolute shrinkage and selection operator (LASSO), Locally Linear Embedding (LLE), Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Multidimensional Scaling (MDS), are employed to process data. In the process of dimension reduction, however, some key information of the original data is lost, which reduces the difference between normal and abnormal samples.
Based on the above analysis, we propose an anomaly detection scheme based on a deep autoencoder. The following contributions are made to the unsupervised anomaly detection of high-dimensional data: (i) A dimension reduction method based on a deep autoencoder and reconstruction of input samples is proposed. The deep autoencoder is used to reduce the dimension of the data, and the dimension reduction result is combined with the reconstruction error to form a low-dimensional reconstructed input sample. The key information of the data is well preserved in these low-dimensional reconstructed input samples, which makes it easier to identify abnormal samples.
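The combination step described in contribution (i) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the summary does not say which reconstruction-error features are used, so relative Euclidean distance and cosine similarity are assumed here, in the style of the deep autoencoding Gaussian mixture model [10]. The function names and the sample values are hypothetical, and the latent code `z` and reconstruction `x_hat` are taken as given rather than produced by a trained autoencoder.

```python
import math


def relative_euclidean_error(x, x_hat):
    """Return ||x - x_hat|| / ||x|| (assumed error feature)."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))
    norm = math.sqrt(sum(a * a for a in x))
    # Fall back to the absolute error if the input has zero norm.
    return diff / norm if norm > 0 else diff


def cosine_similarity(x, x_hat):
    """Return the cosine of the angle between x and x_hat (assumed error feature)."""
    dot = sum(a * b for a, b in zip(x, x_hat))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_h = math.sqrt(sum(b * b for b in x_hat))
    return dot / (norm_x * norm_h) if norm_x > 0 and norm_h > 0 else 0.0


def compose_sample(x, z, x_hat):
    """Concatenate the latent code z with the reconstruction-error features.

    The result is the low-dimensional sample that would be passed to a
    clustering method in place of the original high-dimensional input x.
    """
    return list(z) + [relative_euclidean_error(x, x_hat),
                      cosine_similarity(x, x_hat)]


# Hypothetical values: one input, one latent code, two possible reconstructions.
x = [3.0, 4.0]
z = [0.5]
well_reconstructed = compose_sample(x, z, [3.0, 4.0])
badly_reconstructed = compose_sample(x, z, [4.0, -3.0])
```

With these hypothetical values, the two samples share the same latent code but differ sharply in their error features, which is the kind of separation a downstream clustering method could exploit when normal samples reconstruct well and abnormal ones do not.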

Summary

Introduction

Anomaly detection is a very important branch of machine learning with a wide range of practical applications; it aims to detect special points in data. It is suitable for fault diagnosis [1, 2], system health monitoring [3], network security detection [4], intrusion and fraud detection [5, 6, 7], measurement, and other fields. Now that deep neural networks have achieved good results in other fields, the curse of dimensionality in anomaly detection appears to have reached a turning point. The deep autoencoding Gaussian mixture model [10] has shown good performance on public datasets, providing a new direction for anomaly detection on high-dimensional data.

Journal: Security and Communication Networks
Publication Date: Oct 13, 2021
Citations: 14
License type: CC BY 4.0
