A Virtual Private Network (VPN) provides a concealed transmission channel for communication and protects user privacy. However, its wide adoption also brings hidden dangers to cybersecurity: malicious behavior or harmful information can be transmitted covertly through VPN tunnels to evade firewall censorship. VPN traffic identification is therefore an important part of ensuring cybersecurity. Although many efforts have been made toward VPN traffic identification, existing methods mainly rely on supervised learning models. In this paper, we propose a one-class classification model called AAE-DSVDD for VPN traffic identification. First, we introduce an Adversarial AutoEncoder (AAE) for preliminary modeling of VPN traffic. An AAE can match the aggregated posterior distribution of the hidden layer to an arbitrary prior distribution; here, it associates the samples with a normal distribution in the hidden space. Second, we implement representation learning for VPN traffic via Deep Support Vector Data Description (DSVDD). A standardization method is designed to match the output distribution of DSVDD with the aggregated posterior distribution of the AAE, which alleviates the hypersphere collapse problem of DSVDD and improves identification performance. Finally, we evaluate the AAE-DSVDD model on the public ISCXVPN dataset. Compared with other one-class models, AAE-DSVDD achieves the best VPN traffic identification performance. It also improves recognition of unseen classes that are not included in the training data.
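For readers unfamiliar with the hypersphere collapse problem mentioned above, the one-class Deep SVDD objective in its commonly used form can be sketched as follows. The notation here is assumed for illustration and is not taken from this paper: $\phi(\cdot;\mathcal{W})$ is the network with weights $\mathcal{W} = \{W^1, \dots, W^L\}$, $c$ is a fixed hypersphere center, and $\lambda$ is a weight-decay coefficient. The AAE-DSVDD model may use a different variant.

```latex
% One-class Deep SVDD objective (generic form; notation assumed, not from this paper)
\min_{\mathcal{W}} \;
  \frac{1}{n} \sum_{i=1}^{n} \bigl\| \phi(x_i;\mathcal{W}) - c \bigr\|^{2}
  + \frac{\lambda}{2} \sum_{l=1}^{L} \bigl\| W^{l} \bigr\|_{F}^{2}

% Score for a test sample x: squared distance to the center in the output space
s(x) = \bigl\| \phi(x;\mathcal{W}^{*}) - c \bigr\|^{2}
```

In this form, the first term is driven to zero if the network maps every input to the constant $c$, so a trivial solution exists in which all samples collapse onto the center and the score no longer separates classes. This is the collapse that the distribution matching between the DSVDD output and the AAE posterior is intended to alleviate.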