Abstract

The availability of enormous amounts of unlabeled data drives anomaly detection research towards unsupervised machine learning algorithms. Deep clustering algorithms for anomaly detection have gained significant research attention. We propose an intelligent anomaly detection approach for extensive network traffic analysis based on an Optimized Deep Clustering (ODC) algorithm. First, ODC optimizes the deep AutoEncoder by tuning its hyperparameters, which reduces the AutoEncoder's reconstruction error. Second, ODC feeds the optimized deep AutoEncoder's latent view to the BIRCH clustering algorithm to detect known and unknown malicious network traffic without human intervention. Unlike other deep clustering algorithms, ODC does not require the number of clusters to be specified in advance to analyze the network traffic dataset. We evaluate ODC on the CoAP off-path dataset obtained from our testbed and on the MNIST dataset to compare its accuracy with state-of-the-art clustering algorithms. The evaluation results show that ODC outperforms existing deep clustering methods for anomaly detection.

Highlights

  • Network traffic increase is directly proportional to increasing malicious activities on the internet

  • Based on observations from the background studies and the research gap identified in related work (Section III), we propose Optimized Deep Clustering (ODC) in Section IV for intelligent anomaly detection

  • To evaluate ODC, we use the CoAP off-path dataset [5] to find anomalies in IoT network traffic and the standard, publicly available MNIST [43] image dataset to compare the accuracy of ODC with existing works


Summary

INTRODUCTION

Network traffic increase is directly proportional to increasing malicious activities on the internet. [...] Augmentation (DDC-DA) [10] uses a convolutional AutoEncoder with k-means clustering, and the Gaussian mixture variational AutoEncoder (GMVAE) [12] uses a variational AutoEncoder with k-means clustering. Most of these deep clustering techniques use the k-means clustering algorithm for the clustering step, which requires the number of clusters to be set manually. In a real-time setting, fixing the number of clusters up front (when training the model) for a new dataset may prevent the discovery of new and unknown anomalies. To overcome this major limitation of existing works, we use BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) in our ODC deep clustering technique.
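
The difference from k-means can be illustrated with a minimal sketch, assuming scikit-learn's Birch implementation rather than the authors' own code. The synthetic 2-D vectors are a hypothetical stand-in for the AutoEncoder's latent view, and the threshold and branching_factor values are illustrative choices, not settings taken from the paper. Setting n_clusters=None skips the final global clustering step, so the number of groups follows from the data and the threshold instead of being supplied by hand.

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)

# Hypothetical stand-in for the AutoEncoder's latent view:
# three well-separated groups of 2-D vectors, 100 points each.
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
latent = np.vstack([c + 0.3 * rng.standard_normal((100, 2)) for c in centers])

# n_clusters=None: no final global clustering step, so the number of
# groups is not given in advance. It is governed by `threshold`, the
# maximum radius a subcluster may have after absorbing a new point.
birch = Birch(threshold=1.0, branching_factor=50, n_clusters=None)
labels = birch.fit_predict(latent)

n_found = birch.subcluster_centers_.shape[0]
print(f"subclusters found without specifying a count: {n_found}")
```

In this sketch, a smaller threshold yields more, tighter subclusters, and a larger one merges nearby groups, so the threshold takes over the role that the cluster count plays in k-means.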

BACKGROUND
DEEP AUTOENCODER
BIRCH CLUSTERING ALGORITHM
ENHANCED DEEP AUTOENCODER
OPTIMIZED DEEP CLUSTERING WITH BIRCH
EXPERIMENTAL EVALUATION
Findings
CONCLUSION