Abstract

The number of malware attacks that use encrypted HTTP traffic to self-propagate or communicate has increased dramatically in recent years. Traditional detection methods designed for unencrypted network traffic are difficult to apply to encrypted traffic. While encryption protects users' privacy, it also introduces a security risk: malicious traffic can hide within encrypted traffic, leading to a series of security problems. Because the encrypted payload is large and cannot be observed directly, we need to combine domain knowledge with machine learning methods to uncover the features hidden in malicious traffic and thereby detect it automatically. In this paper, we propose a malicious traffic detection framework based on ensemble learning, using encrypted traffic from real malware provided by the DATACON2020 competition, which comprises normal traffic generated by 3000 normal hosts and malicious traffic generated by 3000 malware-infected hosts. Given the difficulty of decrypting traffic, we describe encrypted traffic with statistical features and sequence features at both the flow level and the host level. We then build multiple classifiers on these heterogeneous features and combine their predictions by majority voting to obtain the final result. We achieve a true positive rate (TPR) of 95.0% and a false positive rate (FPR) of 8.4%.
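As an illustration of the final step, the following minimal sketch shows how per-host predictions from several classifiers could be combined by majority voting and then scored with TPR and FPR. It is not the framework's implementation: the function names, the use of three classifiers, the label encoding (1 = malicious, 0 = normal), and the toy predictions are all assumptions made for the example.

```python
def majority_vote(votes):
    """Return 1 (malicious) if a strict majority of classifiers voted 1, else 0."""
    return 1 if 2 * sum(votes) > len(votes) else 0


def tpr_fpr(y_true, y_pred):
    """Compute (TPR, FPR), treating label 1 (malicious) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical predictions for six hosts from three classifiers, each
# imagined as trained on a different feature set. Toy data, not real results.
per_classifier = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 1],
]
y_true = [1, 1, 0, 0, 1, 0]

# Transpose so that each tuple holds all classifiers' votes for one host.
final = [majority_vote(votes) for votes in zip(*per_classifier)]
tpr, fpr = tpr_fpr(y_true, final)
print(final, tpr, fpr)
```

With an odd number of classifiers a binary vote cannot tie; with an even number, the strict-majority rule above would label a tied host as normal, which is a design choice of this sketch rather than something the abstract specifies.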
