Abstract
In recent years, the emergence of distributed machine learning has enabled deep learning models to ensure data security and privacy while training efficiently. Anomaly detection for network traffic in distributed machine learning scenarios is of great significance for network security. Although deep neural networks have made remarkable achievements in anomaly detection for network traffic, they mainly focus on closed sets, that is, assuming that all anomalies are known. However, in a real network environment, unknown abnormalities are fatal risks faced by the system because they have no labels and occur before the known anomalies. In this study, we design and implement XFinder, a dynamic unknown traffic anomaly detection framework in distributed machine learning. XFinder adopts an online mode to detect unknown anomalies in real-time. XFinder detects unknown anomalies by the unknowns detector, transfers the unknown anomalies to the prior knowledge base by the network updater, and adopts the online mode to report new anomalies in real-time. The experimental results show that the average accuracy of the unknown anomaly detection of our model is increased by 27% and the average F1-Score is improved by 20%. Compared with the offline mode, XFinder’s detection time is reduced by an average of approximately 33% on three datasets, and can better meet the network requirement.
Highlights
With the rapid development of big data, data privacy and security have attracted more and more public attention
The experimental results show that the average accuracy of the unknown anomaly detection of our model is increased by 27% and the average F1-Score is improved by 20%
We can see that both accuracy and f1-score performance are better in the online mode than in the offline mode, the average accuracy of the unknown anomaly detection of the model is increased by 27%, and the average F1-score is increased by 20%
Summary
With the rapid development of big data, data privacy and security have attracted more and more public attention. With the increase of data size and model complexity, it becomes more and more difficult for a single server to accomplish a machine learning task. To address the problem, distributed machine learning is developed. Distributed machine learning (Galakatos et al, 2018) uses multi-node machine learning algorithms and systems, which are designed to improve performance, accuracy, and scale to larger input data sizes. In this scheme, the whole dataset is divided into several subsets and stored distributedly on nodes. Network traffic is the carrier of information transmission and interaction in cyberspace, anomaly detection for XFinder: Detecting Unknown Anomalies network traffic in a distributed machine learning scenarios is of great significance for network security
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.