Detecting Malicious Domains with Behavioral Modeling and Graph Embedding

Kai Lei,Feiyang Wang,Min Yang,Jiake Ni,Qiuai Fu,Kuai Xu

doi:10.1109/icdcs.2019.00066

Abstract

The last decade has witnessed the explosive growth of malicious Internet domains which serve as the fundamental infrastructure for establishing advanced persistent threat command and control communication channels or hosting phishing Web sites. Given the big data nature of Internet traffic data and the ability of algorithmically generating domains and acquiring and registering the domains in a near-automated fashion, detecting malicious domains in real-time is a daunting task for security analysts and network operators. In this paper, we introduce bipartite graphs to capture the interactions between end hosts and domains, identify associated IP addresses of domains, and characterize time-series patterns of DNS queries for domains, and explore one-mode projections of these bipartite graphs for modeling the behavioral, IP-structural, and temporal similarities between domains. We employ graph embedding technique to automatically learn dynamic and discriminative feature representations for over 10,000 labeled domains, and develop an SVM-based classification algorithm for predicting malicious or benign domains. Our model makes the progress towards adapting to the changing and evolving strategies of malicious domains. The experimental results have shown that our proposed algorithm achieves an area under the curve (AUC) of 0.94 based on k-fold cross-validation. To the best of our knowledge, this is the first effort to apply the combination of behavioral modeling and graph embedding for effectively and accurately detecting malicious domains.

Full Text