Abstract

Rapidly growing encrypted traffic hides a large number of malicious behaviours. Because encrypted traffic is difficult to collect and label, the class distribution of datasets is seriously imbalanced, which leads to poor generalisation of the classification model. To solve this problem, a new representation learning method for encrypted traffic and an accompanying diversity enhancement model are proposed, which use the diversity of images to represent the diversity of traffic samples. First, the encrypted traffic is transformed into Markov images. Then, a diversity-maximisation Markov-GAN based on the Simpson index is designed to generate new Markov images. Finally, the balanced Markov image set is fed to a CNN for classification. Experimental results show that the proposed method can predict the whole dataset space with only a few original samples. The classification accuracies under different degrees of imbalance are significantly improved, all exceeding 90%. The enhanced Markov image set also effectively alleviates the deviation in generalisation performance caused by different network depths: even an ordinary CNN achieves almost the same classification performance as VGG13 and VGG16. Compared with other data enhancement methods, the Markov-GAN only needs to balance the dataset in the transform domain, which makes it lightweight and easy to train, with stronger amplification ability.
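
The first step of the pipeline and the diversity measure can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a Markov image is the row-normalised matrix of transition frequencies between consecutive payload bytes (256 states), and that diversity is scored with the Gini-Simpson form of the Simpson index, 1 - sum(p_i^2), over a set of discrete labels. The function names `markov_image` and `simpson_diversity` are hypothetical, and the abstract does not say how the index enters the Markov-GAN objective.

```python
import numpy as np


def markov_image(payload: bytes) -> np.ndarray:
    """Return a 256x256 row-normalised byte-transition matrix.

    Assumption: one state per byte value; the paper may use another symbol size.
    """
    data = np.frombuffer(payload, dtype=np.uint8)
    counts = np.zeros((256, 256), dtype=np.float64)
    # Count each transition from byte i to the byte that follows it.
    np.add.at(counts, (data[:-1], data[1:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions stay all-zero instead of dividing by 0.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def simpson_diversity(labels) -> float:
    """Gini-Simpson diversity 1 - sum(p_i^2) of a collection of discrete labels."""
    _, counts = np.unique(np.asarray(labels), return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))


# Usage: a toy payload alternating between two byte values.
image = markov_image(b"\x00\x01\x00\x01")
print(image.shape)                       # (256, 256)
print(simpson_diversity([0, 0, 1, 1]))   # 0.5
```

Under these assumptions, a set of generated images that collapses onto a single mode scores 0, while an evenly spread set scores close to 1, which is the kind of signal a diversity-maximising generator would need.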
