Abstract

The choice of features to extract from individual or aggregated observations is a critical factor in the success of modern machine learning-based network traffic classification approaches. This activity, usually performed by the designers of the classification scheme, depends strongly on their experience and skills, and ultimately characterizes the whole approach, its implementation strategy, and its performance. The main aim of this work is to support this process by mining new, more expressive, meaningful, and discriminating features from the basic ones without human intervention. To this end, a novel autoencoder-based deep neural network architecture is proposed, in which multiple autoencoders are combined with convolutional and recurrent neural networks to elicit relevant knowledge about the relations among the basic features (spatial features) and their evolution over time (temporal features). This knowledge, consisting of new properties that are not immediately evident and that better represent the most hidden and representative traffic dynamics, can be successfully exploited by machine learning-based classifiers. Different network combinations are analyzed both from a theoretical perspective and through performance evaluation experiments on a real network traffic dataset. We show that the traffic classifier obtained by stacking the autoencoder with a fully-connected neural network achieves up to a 28% improvement in average accuracy over state-of-the-art machine learning-based approaches, up to 10% over pure convolutional and recurrent stacked neural networks, and 18% over pure feed-forward networks. It also maintains high accuracy even in the presence of unbalanced training datasets.
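The stacking idea described above can be sketched in a drastically simplified form: an autoencoder's encoder maps the basic flow features into a learned latent representation, and a fully-connected classifier operates on that representation. The sketch below omits the convolutional and recurrent components and uses random (untrained) weights; all dimensions, names, and the NumPy-only implementation are illustrative assumptions, not the paper's actual architecture or code.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical dimensions: 20 basic per-flow features,
# 8 learned (mined) features, 5 traffic classes.
N_FEATURES, N_LATENT, N_CLASSES = 20, 8, 5

# Encoder half of an autoencoder. In a real pipeline these weights
# would be learned by minimising reconstruction error on traffic data.
W_enc = rng.normal(scale=0.1, size=(N_FEATURES, N_LATENT))
b_enc = np.zeros(N_LATENT)

# Fully-connected classifier stacked on top of the encoder.
W_clf = rng.normal(scale=0.1, size=(N_LATENT, N_CLASSES))
b_clf = np.zeros(N_CLASSES)

def classify(flows):
    """Raw flow features -> learned features -> class probabilities."""
    latent = relu(flows @ W_enc + b_enc)    # mined, more discriminating features
    return softmax(latent @ W_clf + b_clf)  # per-class posteriors

batch = rng.normal(size=(4, N_FEATURES))    # four example flows
probs = classify(batch)
print(probs.shape)  # (4, 5): one probability vector per flow
```

In the paper's full architecture, the encoder stage would additionally capture spatial relations (via convolutions) and temporal evolution (via recurrent layers) before the classifier head.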
