Abstract
The growth of the Internet has made encrypted network traffic increasingly complex. Identifying the specific classes of encrypted network traffic is an important part of maintaining information security. Traditional traffic classification based on machine learning relies heavily on expert experience. As end-to-end models, deep neural networks can minimize human intervention. This paper proposes the CLD-Net model, which can effectively distinguish encrypted network traffic. By segmenting and recombining the packet payload of the raw flow, it automatically extracts payload-related features, and by changing the expression of the packet interval, it integrates packet interval information into the model. We use the ability of a Convolutional Neural Network (CNN) to distinguish image classes to learn and classify the grayscale images produced by preprocessing the raw flow, and then use the effectiveness of a Long Short-Term Memory (LSTM) network on time-series data to further enhance the model's classification ability. Finally, through feature reduction, the high-dimensional features learned by the neural network are reduced to 8 dimensions to distinguish 8 different classes of encrypted network traffic. To verify the effectiveness of the CLD-Net model, we conduct experiments on the ISCX public dataset. The results show that the proposed model can determine whether unknown network traffic uses a Virtual Private Network (VPN) with an accuracy of 98% and can identify the specific traffic type (chat, audio, or file) of the Facebook and Skype applications with an accuracy of 92.89%.
Highlights
With the rapid development of the Internet, new network applications and protocols keep emerging, making network traffic more complex and diverse [1] and posing obstacles to network traffic management.
Raw network traffic is usually stored as .pcap or .pcapng files, which cannot be fed directly into a neural network. The neural network model also requires inputs of uniform length, whereas raw network flows vary in length. Therefore, data preprocessing is required to convert the raw network traffic into grayscale images, which serve as the input of the model; the model is then trained and evaluated on these images to classify the traffic. The preprocessing procedure mainly includes traffic splitting, traffic cleaning, traffic recombination, and traffic conversion.
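The final conversion step can be illustrated with a minimal sketch. It assumes a flow's bytes have already been extracted from the capture file, and it uses a 28 x 28 image size, which is an assumption made here for illustration and is not stated in the text. The function name `flow_to_grayscale` is hypothetical.

```python
from typing import List

# Assumed image side length; the source text does not state the actual size.
IMAGE_SIDE = 28


def flow_to_grayscale(payload: bytes, side: int = IMAGE_SIDE) -> List[List[int]]:
    """Map a variable-length byte string to a side x side grid of 0-255 values.

    Longer inputs are truncated and shorter inputs are zero-padded, so every
    flow yields an input of the same length.
    """
    size = side * side
    # Truncate to the fixed size, then pad on the right with zero bytes.
    fixed = payload[:size].ljust(size, b"\x00")
    # Each byte becomes one grayscale pixel; rows are consecutive byte slices.
    return [list(fixed[row * side:(row + 1) * side]) for row in range(side)]


# Illustrative input only: 30 made-up bytes, not real captured traffic.
image = flow_to_grayscale(b"\x16\x03\x01" * 10)
```

Zero-padding and truncation are one simple way to obtain uniform-length inputs; the splitting, cleaning, and recombination steps named above are not reproduced in this sketch.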
As can be seen from the figure, for distinguishing Virtual Private Network (VPN) traffic from non-VPN traffic, the average accuracy, recall, and F1 score all exceed 0.97, and the average precision exceeds 0.98. The results of the ten experiments are relatively stable, falling roughly between 0.96 and 0.99 with little variation. The larger these values, the better the classification result, which means that true positives (TP) and true negatives (TN) significantly outnumber false positives (FP) and false negatives (FN).
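For reference, the four metrics mentioned above follow from the TP, TN, FP, and FN counts using their standard definitions. The sketch below is illustrative: the function name `binary_metrics` is hypothetical, and the counts in the usage line are made up and are not taken from the paper's experiments.

```python
from typing import Dict


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen only to show the calculation.
metrics = binary_metrics(tp=98, tn=97, fp=2, fn=3)
```

All four values approach 1 only when TP and TN dominate FP and FN, which is why larger values indicate better classification.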
Summary
With the rapid development of the Internet, new network applications and protocols keep emerging, making network traffic more complex and diverse [1] and posing obstacles to network traffic management. Network traffic classification and recognition is an important foundation for network detection and management and one of the key technologies for maintaining cyberspace security. The use of traffic encryption is a double-edged sword. While it improves and maintains the security and privacy of users, it also prevents third parties on the network link from using Deep Packet Inspection (DPI) to match and screen key fields in the traffic payload [3], which hinders traffic inspection by firewalls. This has inspired security personnel to introduce machine learning into the field of traffic analysis and to analyze and study encrypted traffic from a statistical