Abstract
Network traffic classification has long been a pivotal topic in network security. Over the past two decades, port-based classification, deep packet inspection, and machine learning approaches have progressed significantly. However, their effectiveness is now declining because of the growing complexity of the Internet, new encryption protocols, and advanced defense strategies. Because traditional models cannot generalize efficiently to encrypted traffic, two technology paths currently appear promising: deep learning and pre-training. On the one hand, deep learning-based methods effectively dissect complex network structures and uncover key relational patterns. They benefit from the robust generalization capabilities of neural networks, which significantly improve the accuracy and efficiency of recognition. Graph representation learning stands out as the most compelling contemporary model for such analysis, as it reveals the critical relationships within network communication structures. We introduce mainstream deep learning-based methods in detail and analyze their mechanisms and application scenarios. On the other hand, although analysis based on large models is the trend of the field, its application is currently limited. We therefore underscore the importance of pre-training, which aligns with the future trajectory toward adopting large-scale models in encrypted traffic analysis. Pre-trained models can overcome various defects of previous models and achieve stronger performance through their low dependence on labeled data and strong adaptability across scenarios. We provide a comprehensive overview of existing pre-training-based approaches across the three stages of operation (input, pre-training, and fine-tuning) and compare representative related work.
Finally, in light of current needs and the room for improvement in existing pre-training methods, we summarize and analyze the challenges and opportunities open to interested researchers.