Abstract
The fine-grained classification of encrypted traffic is important for network security analysis, because malicious attacks are usually encrypted and disguised as normal application or content traffic. Supervised machine learning methods are widely used for traffic classification and perform well. However, they need a large amount of labeled data to train a model, and labeled data is hard to obtain. To address this problem, this paper proposes a method to train a model based on the K-nearest neighbor (KNN) algorithm, which needs only a small amount of data. Because the importance of different traffic features varies, and traditional KNN treats all features alike, this study introduces the concept of feature weight and proposes the weighted feature KNN (WKNN) algorithm. Furthermore, to obtain the optimal feature set and the corresponding feature weight set, a feature selection and feature weight self-adaptive algorithm for WKNN is proposed. In addition, a three-layer classification framework for encrypted network flows is established. Based on the improved KNN and the framework, this study presents a method for fine-grained classification of encrypted network flows, which identifies the encryption status, application type, and content type of encrypted network flows with accuracies of 99.3%, 92.4%, and 97.0%, respectively.
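To make the idea of feature weighting concrete, the following is a minimal sketch of a KNN classifier whose distance scales each feature by a weight. It is an illustration only and not the paper's WKNN: the weighted Euclidean distance, the function names, the toy flows, the labels, and the hand-picked weights are all assumptions, and the paper's feature selection and weight self-adaptive algorithm is not reproduced here.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance with each squared feature difference scaled by its weight (assumed form)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def wknn_predict(train_X, train_y, query, weights, k=3):
    """Majority vote among the k training samples closest to `query` under the weighted distance."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: weighted_distance(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy flows: feature 0 separates the classes, feature 1 is large-scale noise.
train_X = [(0.1, 9.0), (0.2, 1.0), (0.9, 8.0), (1.0, 2.0)]
train_y = ["chat", "chat", "video", "video"]
query = (0.15, 8.5)

# Equal weights behave like plain KNN, so the noisy feature dominates the distance.
plain = wknn_predict(train_X, train_y, query, weights=(1.0, 1.0), k=3)

# Hand-picked weights that suppress the noisy feature (assumed, not learned).
weighted = wknn_predict(train_X, train_y, query, weights=(1.0, 0.0), k=3)
```

With equal weights the vote is driven by the noisy second feature; once that feature is down-weighted, the neighbors are chosen by the informative first feature, which is the effect feature weighting is meant to achieve.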
Highlights
Traffic-classification technology plays an important role in network security defense mechanisms. It is the basis for analyzing network traffic, detecting network anomalies, and balancing network load [1].
Because different features affect classification to different degrees, this study introduces the concept of feature weight and proposes a weighted feature K-nearest neighbor (WKNN) algorithm.
Following the three classification layers in FCE-KNN, this section is divided into three subsections, which analyze the performance of FCE-KNN in identifying the encryption status, application type, and content type of encrypted flows, respectively.
Summary
Traffic-classification technology plays an important role in network security defense mechanisms. It is the basis for analyzing network traffic, detecting network anomalies, and balancing network load [1]. Fine-grained classification, which includes analyzing the application and content types of encrypted traffic, is an important research area [7]. Machine-learning methods need a large amount of labeled data to train a model for fine-grained classification [9]. This is difficult to achieve in a real network because labeled data are hard to obtain [10] and the model must be updated periodically to cope with concept drift [11,12].