An Encrypted Traffic Classification Framework Based on Convolutional Neural Networks and Stacked Autoencoders

Maonan Wang, Dan Luo, Xiujuan Wang, Yanqing Yang, Kangfeng Zheng

doi:10.1109/iccc51575.2020.9344978

Abstract

In recent years, deep learning-based encrypted traffic classification has proven effective, especially approaches that use neural networks to extract features from raw traffic. However, most neural networks need a fixed-size input, so the raw traffic has to be trimmed. Trimming loses some information; for example, the number of packets in a session is no longer known. To solve this problem, this paper proposes a framework that combines a convolutional neural network (CNN) and a stacked autoencoder (SAE). The framework uses the CNN to extract high-level features from raw network traffic and the SAE to encode 26 statistical features calculated directly from the raw traffic. The statistical features compensate for the information lost to trimming. The outputs of the CNN and of the SAE encoder are then combined into new high-level features, which carry information from both the trimmed raw traffic and the statistical features. Finally, these new high-level features are used to classify encrypted traffic. The “ISCX VPN-nonVPN” traffic dataset is used to demonstrate the feasibility of the framework. The proposed framework improves the performance of encrypted traffic classification, achieving an F1-score of 0.98. Furthermore, the new high-level features, which are generated by combining the features extracted by the convolutional neural network and the stacked autoencoder, represent the different classes of traffic well. More importantly, this work is unique in the encrypted traffic classification field, as it is the first to use both raw traffic and statistical features as the input of the model.
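The trimming problem and the role of the statistical features can be illustrated with a minimal sketch. It is not the authors' implementation: the input length of 784 bytes, the five statistics, and the function names `trim_or_pad`, `session_stats`, and `fuse` are all assumptions made for illustration, and the learned CNN and SAE encoder are omitted, so the raw vectors stand in for the two branch outputs.

```python
from statistics import mean

INPUT_LEN = 784  # hypothetical fixed input size; the abstract does not give one


def trim_or_pad(packets, length=INPUT_LEN):
    """Concatenate packet payloads, then cut or zero-pad to `length` values in [0, 1]."""
    raw = b"".join(packets)
    fixed = raw[:length].ljust(length, b"\x00")
    return [byte / 255.0 for byte in fixed]


def session_stats(packets):
    """Illustrative statistical features; stand-ins for the paper's 26."""
    sizes = [len(p) for p in packets]
    return [float(len(sizes)), float(sum(sizes)), float(mean(sizes)),
            float(min(sizes)), float(max(sizes))]


def fuse(raw_branch_out, stats_branch_out):
    """Concatenate the two branch outputs into one feature vector."""
    return list(raw_branch_out) + list(stats_branch_out)


# Two sessions that share their first 784 bytes but differ in packet count.
short_session = [bytes([1]) * 500, bytes([2]) * 500]
long_session = short_session + [bytes([3]) * 500] * 3

# Trimming alone cannot tell the two sessions apart.
same_after_trim = trim_or_pad(short_session) == trim_or_pad(long_session)

# In the framework these inputs would first pass through the CNN and the SAE
# encoder; here the unprocessed vectors are concatenated as placeholders.
fused_short = fuse(trim_or_pad(short_session), session_stats(short_session))
fused_long = fuse(trim_or_pad(long_session), session_stats(long_session))
```

In this toy case the trimmed inputs are identical, while the fused vectors differ because the packet count and total size survive in the statistical part. This is the kind of information the abstract says trimming discards.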

Full Text