Short Text Classification and Clustering based Mobile Application Traffic Identification Method

Yuanhao Li, Shuhui Chen, Shuang Zhao

doi:10.1088/1742-6596/1616/1/012109

Open Access

https://doi.org/10.1088/1742-6596/1616/1/012109

Abstract

With the rapid development of mobile networks, mobile traffic now accounts for a large proportion of network traffic, and mobile application traffic identification is becoming increasingly important for network security. Recent work on mobile application traffic identification has focused on supervised classifiers, which have shown promising performance. However, labeled traffic is difficult to obtain in practice, and most traffic is unlabeled. In this paper, we propose a semi-supervised mobile application traffic identification method based on short text classification and clustering, which requires only a small number of labeled samples to classify traffic. First, the plain text in mobile traffic is treated as short text, and its features are extracted using a short text classification algorithm. Then K-Means++ is used to cluster the samples and produce the predictions. Experimental results show that the proposed method is more effective than semi-supervised traffic identification methods that use statistical features.
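The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. Several parts are assumptions and are not taken from the paper: the abstract does not name the short text classification algorithm, so a plain bag-of-words term-frequency vector stands in for the feature extractor; the rule that each cluster takes the majority label of its few labeled members is one plausible way to use the labeled samples; and the request strings, hostnames, and application names are hypothetical.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    # Split a plain-text payload into lowercase alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(texts):
    # Stand-in feature extractor (assumption): L2-normalised term-frequency vectors.
    vocab = sorted({tok for t in texts for tok in tokenize(t)})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for t in texts:
        vec = [0.0] * len(vocab)
        for tok, n in Counter(tokenize(t)).items():
            vec[index[tok]] = float(n)
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp(points, k, rng, iters=50):
    # K-Means++ seeding: the first centre is chosen uniformly at random; each
    # later centre is sampled with probability proportional to its squared
    # distance from the nearest centre chosen so far.
    centres = [list(rng.choice(points))]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        if sum(d2) == 0:  # every point coincides with an existing centre
            break
        centres.append(list(rng.choices(points, weights=d2, k=1)[0]))
    # Lloyd iterations: assign points to the nearest centre, then recompute centres.
    assign = None
    for _ in range(iters):
        new_assign = [
            min(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
            for p in points
        ]
        if new_assign == assign:
            break
        assign = new_assign
        for j in range(len(centres)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def label_clusters(assign, labeled):
    # labeled maps sample index -> application name (the few labeled samples).
    # Each cluster takes the majority label of its labeled members (assumption);
    # clusters without any labeled member are reported as "unknown".
    votes = {}
    for i, app in labeled.items():
        votes.setdefault(assign[i], Counter())[app] += 1
    cluster_label = {c: v.most_common(1)[0][0] for c, v in votes.items()}
    return [cluster_label.get(a, "unknown") for a in assign]


# Hypothetical plain-text request headers from two hypothetical applications.
flows = [
    "GET /api/feed HTTP/1.1 Host: api.chatapp.example User-Agent: ChatApp/3.2",
    "POST /api/send HTTP/1.1 Host: api.chatapp.example User-Agent: ChatApp/3.2",
    "GET /api/contacts HTTP/1.1 Host: api.chatapp.example User-Agent: ChatApp/3.2",
    "GET /v2/tiles HTTP/1.1 Host: maps.navapp.example User-Agent: NavApp/1.0",
    "GET /v2/route HTTP/1.1 Host: maps.navapp.example User-Agent: NavApp/1.0",
    "GET /v2/search HTTP/1.1 Host: maps.navapp.example User-Agent: NavApp/1.0",
]
labeled = {0: "ChatApp", 3: "NavApp"}  # only two of the six flows carry labels

assignments = kmeans_pp(vectorize(flows), k=2, rng=random.Random(0))
predictions = label_clusters(assignments, labeled)
print(predictions)
```

Because K-Means++ seeding is randomised, a single run can settle in a poor local optimum. Practical implementations usually repeat the clustering with several seeds and keep the run with the lowest within-cluster distance, and often use more clusters than applications.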

Full Text