Data-Driven Network Analysis for Anomaly Traffic Detection.

Shumon Alam,Suxia Cui,Cajetan Akujuobi,Yasin Alam

doi:10.3390/s23198174

Abstract

Cybersecurity is a critical issue in today's internet world. Classical security systems, such as firewalls based on signature detection, cannot detect today's sophisticated zero-day attacks. Machine learning (ML) based solutions are more attractive for their capabilities of detecting anomaly traffic from benign traffic, but to develop an ML-based anomaly detection system, we need meaningful or realistic network datasets to train the detection engine. There are many public network datasets for ML applications. Still, they have limitations, such as the data creation process and the lack of diverse attack scenarios or background traffic. To create a good detection engine, we need a realistic dataset with various attack scenarios and various types of background traffic, such as HTTPs, streaming, and SMTP traffic. In this work, we have developed realistic network data or datasets considering various attack scenarios and diverse background/benign traffic. Furthermore, considering the importance of distributed denial of service (DDoS) attacks, we have compared the performance of detecting anomaly traffic of some classical supervised and our prior developed unsupervised ML algorithms based on the convolutional neural network (CNN) and pseudo auto-encoder (AE) architecture based on the created datasets. The results show that the performance of the CNN-Pseudo-AE is comparable to that of many classical supervised algorithms. Hence, the CNN-Pseudo-AE algorithm is promising in actual implementation.

Full Text