Abstract

The fine-grained classification of encrypted traffic is important for network security analysis. Malicious attacks are usually encrypted and disguised as normal application or content traffic. Supervised machine learning methods are widely used for traffic classification and perform well. However, they need a large amount of labeled data to train a model, and labeled data is hard to obtain. To address this problem, this paper proposes a method to train a model based on the K-nearest neighbor (KNN) algorithm, which needs only a small amount of data. Because the importance of different traffic features varies and traditional KNN does not distinguish between features by importance, this study introduces the concept of feature weight and proposes the weighted feature KNN (WKNN) algorithm. Furthermore, to obtain the optimal feature set and the corresponding feature weight set, a feature selection and feature weight self-adaptive algorithm for WKNN is proposed. In addition, a three-layer classification framework for encrypted network flows is established. Based on the improved KNN and the framework, this study presents a method for fine-grained classification of encrypted network flows, which identifies the encryption status, application type, and content type of encrypted network flows with accuracies of 99.3%, 92.4%, and 97.0%, respectively.
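The idea of weighting features inside KNN can be illustrated with a minimal sketch. The abstract does not give the exact distance formula, so the sketch assumes a weighted Euclidean distance and a simple majority vote. The function names, the two toy features, and the labels are all hypothetical, and the weights are set by hand rather than by the self-adaptive algorithm the paper proposes.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (a_i - b_i)^2).

    Assumption: the paper's actual distance definition may differ.
    """
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def wknn_predict(train_X, train_y, query, weights, k=3):
    """Return the majority label among the k training samples nearest to query."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: weighted_distance(train_X[i], query, weights),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy flows with two hypothetical features: feature 0 separates the
# classes cleanly, feature 1 is mostly noise.
train_X = [
    [0.10, 0.9], [0.20, 0.1], [0.15, 0.5],  # labeled "encrypted"
    [0.80, 0.2], [0.90, 0.8], [0.85, 0.4],  # labeled "plain"
]
train_y = ["encrypted"] * 3 + ["plain"] * 3
query = [0.25, 0.3]

# Hand-picked weights: first trust only feature 0, then only feature 1.
print(wknn_predict(train_X, train_y, query, weights=[1.0, 0.0]))
print(wknn_predict(train_X, train_y, query, weights=[0.0, 1.0]))
```

On this toy data the prediction changes with the weights: the query is labeled "encrypted" when only the informative feature counts and "plain" when only the noisy feature counts. This is why the choice of feature weights matters for the classifier.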

Highlights

  • Traffic-classification technology plays an important role in network security defense mechanisms. It is the basis for analyzing network traffic, detecting network anomalies, and balancing network load [1]

  • Considering that different features affect classification differently, this study introduces the concept of feature weight and proposes a weighted feature K-nearest neighbor (WKNN) algorithm

  • According to the three classification layers in FCE-KNN, this section is divided into three subsections, which analyze the performance of FCE-KNN in identifying the encryption status, application type, and content type of encrypted flows, respectively


Summary

Introduction

Traffic-classification technology plays an important role in network security defense mechanisms. It is the basis for analyzing network traffic, detecting network anomalies, and balancing network load [1]. Fine-grained classification, which includes analyzing the application and content types of encrypted traffic, is an important research area [7]. Machine-learning methods need a large amount of labeled data to train a model for fine-grained classification [9]. This is difficult to achieve in a real network because labeled data are hard to obtain [10] and the model must be updated periodically to cope with concept drift [11,12].

