Abstract

Anomaly detection in network traffic has become a challenging task due to the complexity of large-scale networks and the proliferation of social network applications. In actual industrial environments, only recently obtained unlabelled data can be used as the training set, and the commonly used unsupervised algorithms depend heavily on an accurate anomaly ratio in the training set as prior knowledge. In this study, an anomaly detection algorithm based on X-means and iForest, named X-iForest, is proposed. It uses X-means to cluster the standard Euclidean distances between candidate anomalies and the normal cluster centre, achieving a secondary filtering step. We compared X-iForest with seven mainstream unsupervised algorithms in terms of AUC and anomaly detection rate. Extensive experiments showed that X-iForest has notable advantages over the other algorithms and can be applied well to anomaly detection in large-scale network traffic data.
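The two-stage idea described above can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: scikit-learn's KMeans with a fixed k=2 stands in for X-means (which scikit-learn does not provide), the data is synthetic, and all parameter values are assumptions for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)

# Synthetic traffic-like data: a dense normal cluster plus 10 planted outliers.
normal = rng.normal(loc=0.0, scale=1.0, size=(300, 4))
outliers = rng.uniform(low=6.0, high=9.0, size=(10, 4))
X = np.vstack([normal, outliers])

# Stage 1: iForest flags candidate anomalies (-1 = candidate).
iforest = IsolationForest(n_estimators=100, contamination=0.05, random_state=0)
labels = iforest.fit_predict(X)
candidates = np.where(labels == -1)[0]
normals = np.where(labels == 1)[0]

# Stage 2 (secondary filtering): cluster the candidates' Euclidean
# distances to the normal cluster centre and keep only the far cluster,
# discarding candidates that lie close to normal traffic.
centre = X[normals].mean(axis=0)
dists = np.linalg.norm(X[candidates] - centre, axis=1).reshape(-1, 1)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(dists)
far_cluster = int(np.argmax(km.cluster_centers_.ravel()))
confirmed = candidates[km.labels_ == far_cluster]
```

The secondary filter is what distinguishes the approach from plain iForest: candidates whose distance to the normal centre falls in the near cluster are treated as false positives and dropped.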

Highlights

  • In recent years, the network environment has become increasingly complex

  • The complex network environment and the surge of traffic data make the detection of network traffic anomalies a considerable challenge facing enterprises today

  • In addition to the explosive growth of traffic data, the current unsupervised anomaly detection algorithms commonly used in industrial applications cannot be well implemented in a real complex network environment


Summary

Related work

Following extensive investigation of actual industrial applications and recently published articles in the fields of network health analysis and network traffic anomaly detection, the main methods can be classified as follows. Prior work has proposed an X-means and isolation-forest-based methodology for network traffic anomaly detection, together with a new general formula for distance calculation and a PCA-based IoT detection method, and verified the feasibility of the proposed approach through a variety of experiments. Distance-based approaches incur a very high computational cost on massive datasets and lose performance when applied to network traffic anomaly detection. Density-based approaches introduce the concept of LOF, in which each instance is assigned a score denoting its degree of outlierness based on the local density of its neighbours.
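The LOF idea mentioned above can be illustrated with a short sketch; this uses scikit-learn's LocalOutlierFactor on synthetic data, and the neighbourhood size and contamination rate are assumptions for the example, not values from the paper.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# A dense normal cluster plus two planted outliers at indices 200 and 201.
normal = rng.normal(size=(200, 2))
outliers = np.array([[8.0, 8.0], [9.0, -7.0]])
X = np.vstack([normal, outliers])

# LOF scores each instance by comparing its local density with that of
# its k nearest neighbours; fit_predict returns -1 for flagged outliers.
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
labels = lof.fit_predict(X)
flagged = np.where(labels == -1)[0]
```

Because LOF is purely local, it can flag points that sit in sparse regions even when their absolute distance to the data centre is moderate, which is the property the related work exploits.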

Materials and methods
Experiments
Evaluation metric
Findings
Conclusion

