An Imbalanced Malicious Domains Detection Method Based on Passive DNS Traffic Analysis

Zhenyan Liu, Jiangtao Liu, Yifei Zeng, Jingfeng Xue, Pengfei Zhang, Ji Zhang

doi:10.1155/2018/6510381

Open Access

https://doi.org/10.1155/2018/6510381

Abstract

Although existing malicious domains detection techniques have shown great success in many real-world applications, the problem of learning from imbalanced data has so far received little attention. Actual DNS traffic, however, is inherently imbalanced, so building a malicious domains detection model suited to imbalanced data is an important issue worthy of study. This paper proposes a novel imbalanced malicious domains detection method based on passive DNS traffic analysis, which deals effectively with both the between-class imbalance problem and the within-class imbalance problem. The experiments show that the proposed method performs favorably compared with existing algorithms.

Highlights

With the rapid development of the Internet and information technology, network security threats are escalating: the security situation in cyberspace is becoming increasingly complex and concealed, network security risk is increasing, and malicious network attacks of many kinds keep emerging.
To verify the novel HAC EasyEnsemble algorithm for learning from imbalanced DNS traffic data, we run a series of experiments comparing the performance of HAC EasyEnsemble and EasyEnsemble on the same dataset.
To determine the number of base classifiers T, mentioned in Section 4, for the HAC EasyEnsemble classification model, we first run a series of experiments.

Summary

Introduction

With the rapid development of the Internet and information technology, network security threats are escalating: the security situation in cyberspace is becoming increasingly complex and concealed, network security risk is increasing, and malicious network attacks of many kinds keep emerging. Current research commonly employs machine learning classification algorithms to detect malicious domains [1, 2]. These existing studies pay little or no attention to the problem of imbalanced data. When learning from an imbalanced dataset, class information must be considered; otherwise the classifier will be overwhelmed by the majority classes and ignore the minority ones, and the overall classification performance will be degraded. To address this shortfall, this paper proposes an imbalanced malicious domains detection method that builds a malicious domains detection model by learning from an imbalanced dataset based on passive DNS traffic analysis.
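The point about a classifier being overwhelmed by the majority class can be illustrated with a small sketch. It is not the paper's method; it only shows why plain accuracy hides the problem. The 990/10 split between benign and malicious labels and all function names are assumptions chosen for illustration, not figures from the paper.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most frequent label: a stand-in for a classifier
    that has collapsed onto the majority class."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of truly positive samples that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    hits = sum(p == positive for p in preds_for_positives)
    return hits / len(preds_for_positives)


# Hypothetical, heavily imbalanced label set (assumed ratio, for illustration).
y_true = ["benign"] * 990 + ["malicious"] * 10

# The degenerate classifier predicts the majority label for every domain.
predicted_label = majority_baseline(y_true)
y_pred = [predicted_label] * len(y_true)

overall_accuracy = accuracy(y_true, y_pred)
malicious_recall = recall(y_true, y_pred, "malicious")

print(f"accuracy = {overall_accuracy:.2f}, malicious recall = {malicious_recall:.2f}")
```

Under these assumed numbers the degenerate classifier scores high accuracy while detecting no malicious domain at all, which is why imbalance-aware learning and per-class metrics matter for this task.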

Related Work

Profiling Malicious Domains Based on Passive DNS Traffic Analysis

An Imbalanced Malicious Domains Detection Method

Experiment

Findings

Conclusions