Abstract

Mobile-app identification over encrypted network traffic plays a vital role in network management, cyberspace security, and advertising analysis. The mainstream approach combines machine learning algorithms with traffic features and assumes that training and test traffic are independent and identically distributed (i.i.d.). However, the distribution of test flows can drift because of app updates and differences across mobile platforms and regions, which lowers accuracy. Existing methods manually relabel the non-i.i.d. traffic and retrain the model from scratch, which requires substantial human effort and thus limits deployment. In this paper, we propose a Flow Domain Adaptation Neural Network (FDAN) that improves the accuracy of mobile-app identification over non-i.i.d. traffic with zero relabelling. FDAN transforms the drifted test flows into approximately i.i.d. samples by reducing the differences between traffic distributions in a trainable feature space. Specifically, we adopt two domain discriminators and a feature generator, trained under an adversarial loss, to strengthen the invariance of the features. An app predictor, trained on labelled data from the source domain, keeps the features discriminative. We conduct extensive experiments on public and private datasets covering changes in app version, platform, and region. FDAN comes with theoretical guarantees, improves F1 by 7.8% to 47% with zero relabelling, and outperforms the compared methods.
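
The interplay of the four components named above can be illustrated with a minimal sketch of a generic domain-adversarial objective. It is not FDAN's implementation: the abstract does not specify the network layers, the exact loss, or how the two discriminators differ, so the sketch assumes single linear layers, two discriminators that both classify features as source (0) or target (1), and a hypothetical trade-off weight `lam`. All function names and the toy data are illustrative.

```python
import math
import random


def linear(x, w, b):
    """Apply y = W x + b to a single input vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init(rows, cols, rng):
    """Random weight matrix (rows x cols) and zero bias, as a (w, b) pair."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return w, [0.0] * rows


def bce(p, label):
    """Binary cross-entropy for one predicted probability p."""
    eps = 1e-12
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))


def adversarial_losses(src, src_labels, tgt, params, lam=1.0):
    """Compute the loss terms of a generic domain-adversarial setup.

    params = (generator, discriminator_1, discriminator_2, predictor),
    each a (w, b) pair. Assumption: both discriminators see the same
    generated features and predict source (0) versus target (1).
    """
    gen, disc1, disc2, pred = params
    cls_loss, dom_loss = 0.0, 0.0
    for x, y in zip(src, src_labels):
        f = relu(linear(x, *gen))
        # Supervised app-prediction loss: labelled source flows only.
        cls_loss += -math.log(softmax(linear(f, *pred))[y] + 1e-12)
        for d in (disc1, disc2):
            dom_loss += bce(sigmoid(linear(f, *d)[0]), 0)
    for x in tgt:
        # Target flows carry no app labels; they only feed the discriminators.
        f = relu(linear(x, *gen))
        for d in (disc1, disc2):
            dom_loss += bce(sigmoid(linear(f, *d)[0]), 1)
    cls_loss /= len(src)
    dom_loss /= 2 * (len(src) + len(tgt))
    return {
        "classification": cls_loss,  # minimized by generator and predictor
        "domain": dom_loss,          # minimized by the discriminators
        # The generator minimizes this, i.e. it tries to raise the domain loss.
        "generator_objective": cls_loss - lam * dom_loss,
    }


# Toy data: 6 flow features, 4 latent features, 3 apps (all values synthetic).
rng = random.Random(0)
params = (init(4, 6, rng), init(1, 4, rng), init(1, 4, rng), init(3, 4, rng))
src = [[rng.random() for _ in range(6)] for _ in range(8)]
src_labels = [rng.randrange(3) for _ in range(8)]
tgt = [[rng.random() + 0.3 for _ in range(6)] for _ in range(8)]  # shifted
losses = adversarial_losses(src, src_labels, tgt, params, lam=0.5)
```

In a training loop, the discriminators would be updated to lower `losses["domain"]` while the generator and predictor are updated to lower `losses["generator_objective"]`. The opposing signs on the domain term are what push the generated features toward being indistinguishable across domains, and no target-domain app labels are needed at any point.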
