Abstract

The proliferation of handheld devices has led to explosive growth in mobile traffic volumes on the Internet. Identifying mobile apps from network traffic has become a crucial task for mobile network management and security. Traditionally, the design of accurate identifiers has relied on deep packet inspection (DPI) techniques. However, such approaches have become less effective with the rising adoption of encrypted protocols (mostly TLS) in mobile applications. To address the problem, various machine learning methods have been studied and used. Most of them apply linear classifiers on top of hand-engineered features, which are unreliable due to the complexity of mobile traffic. In this article, we propose App-Net, an end-to-end hybrid neural network for mobile app identification from encrypted TLS traffic. App-Net combines an RNN and a CNN in parallel and automatically learns effective features from raw TLS flows. With coordinated fusion and optimized training, the hybrid, multimodal architecture characterizes both flow sequence patterns and app signatures to learn a joint flow-app embedding. We evaluate App-Net on a real-world dataset covering 80 apps. The results show that our method achieves excellent performance and outperforms state-of-the-art methods.

Highlights

  • In order to gain visibility and control over applications traversing the network, network operators need to identify an application by the traffic it generates

  • We focus on mobile app identification from encrypted TLS traffic with deep learning (DL) approaches

  • App-Net consists of a long short-term memory (LSTM) recurrent neural network that learns effective features from raw flow sequences, a convolutional neural network that extracts byte signatures from the initial TLS packet payload, and a feature fusion stage that learns coordinated, joint representations from the features produced by both the RNN and the CNN
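As a rough illustration of the two inputs named in the last highlight, the sketch below turns one flow into a sequence view (for the LSTM path) and a byte view (for the CNN path). Everything specific here is an assumption made for illustration and is not taken from the paper: the `Packet` layout, the `flow_to_inputs` helper, the use of signed payload lengths as the sequence feature, the choice of the first non-empty payload as the "initial TLS packet", and the sizes `SEQ_LEN` and `PAYLOAD_LEN`.

```python
from typing import List, Tuple

# Hypothetical packet record: (direction, payload bytes). Direction is +1 for
# client-to-server and -1 for server-to-client. This layout is an assumption.
Packet = Tuple[int, bytes]

SEQ_LEN = 20      # assumed number of packets kept for the sequence (LSTM) view
PAYLOAD_LEN = 64  # assumed number of payload bytes kept for the byte (CNN) view


def flow_to_inputs(flow: List[Packet]) -> Tuple[List[int], List[float]]:
    """Turn one flow into (signed length sequence, scaled payload bytes)."""
    # Sequence view: signed payload length per packet, truncated or zero-padded.
    seq = [direction * len(payload) for direction, payload in flow[:SEQ_LEN]]
    seq += [0] * (SEQ_LEN - len(seq))

    # Byte view: first non-empty payload in the flow, truncated or zero-padded,
    # with each byte scaled to the range [0, 1].
    first = next((payload for _, payload in flow if payload), b"")
    raw = first[:PAYLOAD_LEN].ljust(PAYLOAD_LEN, b"\x00")
    byte_view = [b / 255.0 for b in raw]
    return seq, byte_view
```

Fixed-length, zero-padded inputs are a common convenience for batching; the actual preprocessing used for App-Net may differ.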


Summary

INTRODUCTION

To gain visibility and control over applications traversing the network, network operators need to identify an application by the traffic it generates. App-Net consists of an LSTM recurrent neural network that learns effective features from raw flow sequences, a convolutional neural network that extracts byte signatures from the initial TLS packet payload, and a feature fusion stage that learns coordinated, joint representations from the features produced by both the RNN and the CNN. These representations are ultimately used for app identification (APP-ID).
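The fusion step described above can be pictured with a small sketch. The text here does not specify the fusion operator, so the code swaps in a plain late-fusion baseline: concatenate the two feature vectors, apply one linear layer, and take a softmax over the app classes. The function names, the `weights` and `bias` parameters, and the feature sizes are hypothetical; only the count of 80 apps comes from the text.

```python
import math
from typing import List

NUM_APPS = 80  # the dataset described in the text covers 80 apps


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(
    seq_feat: List[float],       # features from the sequence (RNN) path
    byte_feat: List[float],      # features from the byte (CNN) path
    weights: List[List[float]],  # hypothetical layer: one row per app class
    bias: List[float],           # hypothetical bias: one value per app class
) -> List[float]:
    """Concatenate both feature vectors and return per-app probabilities."""
    joint = seq_feat + byte_feat  # assumed fusion: simple concatenation
    logits = [
        sum(w * x for w, x in zip(row, joint)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)
```

In a trained model the two feature extractors and the fusion layer would be optimized jointly; this sketch only shows how a joint representation can feed a single classifier.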

RELATED WORK
THE LSTM PATH
EXPERIMENT SETTINGS
EVALUATION METRICS
CONCLUSION
