A Cause-Based Classification Approach for Malicious DNS Queries Detected Through Blacklists

Akihiro Satoh, Yutaka Fukuda, Yutaka Nakamura, Kazuto Sasai, Gen Kitagata

doi:10.1109/access.2019.2944203

Abstract

Some of the most serious security threats facing computer networks involve malware. To counter this threat, administrators need to swiftly remove infected machines from their networks. One common way to detect infected machines in a network is by monitoring communications based on blacklists. However, detection using this method has two problems: no blacklist is completely reliable, and blacklists do not provide sufficient evidence to allow administrators to determine the validity and accuracy of the detection results. Therefore, simply matching communications with blacklist entries is insufficient, and administrators should determine the cause of each detection by investigating the communications themselves. In this paper, we propose an approach for classifying malicious DNS queries detected through blacklists by their causes. This approach is motivated by the following observation: a malware communication is divided into several transactions, each of which generates queries related to the malware; thus, the surrounding queries that occur before and after a malicious query detected through blacklists help in estimating the cause of that query. Our cause-based classification drastically reduces the number of malicious queries to be investigated because the investigation scope is limited to the representative queries in the classification results. In experiments, we confirmed that our approach could group 388 malicious queries into 3 clusters, each consisting of queries with a common cause. These results indicate that administrators can quickly identify all the causes by investigating only the representative queries of each cluster, and thereby swiftly address the problem of infected machines in the network.

Highlights

Some of the most serious security threats facing computer networks involve malware
We propose a novel approach for classifying malicious DNS queries detected through blacklists by their causes
This approach is motivated by the following important observation: a malware communication is divided into several transactions, each of which generates queries related to the malware; surrounding queries that occur before and after a malicious query detected through blacklists help in estimating the cause of the malicious query

Summary

Introduction

Some of the most serious security threats facing computer networks involve malware. Cyber-criminals use malware-infected machines to undertake malicious activities such as stealing confidential information, spreading malware to additional machines, and launching phishing attacks against an organization. One common way to detect infected machines in a network is by monitoring communications based on blacklists. This method detects machines suspected of being infected with malware by matching their communications with blacklist entries. To improve the detection capability of this method, several studies have attempted to automatically update blacklist entries by using machine learning techniques [2], [3]. However, detection using this method has two problems [4], [5]: (1) no blacklist is completely reliable, and (2) blacklists do not provide sufficient evidence to allow administrators to determine the validity and accuracy of the detection results.
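
To make the blacklist-matching step and the notion of surrounding queries concrete, the following is a minimal Python sketch, not the authors' implementation. The Query record, the function names, the suffix-matching rule (a name is flagged if it or any parent domain is on the blacklist), and the fixed time window are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Query:
    """One DNS query log record (hypothetical schema)."""
    timestamp: float  # seconds on an arbitrary clock
    client: str       # address of the machine that issued the query
    name: str         # queried domain name


def is_blacklisted(name: str, blacklist: Set[str]) -> bool:
    """Return True if the name or any of its parent domains is blacklisted.

    Suffix matching is an assumption of this sketch; a deployment might
    instead require exact matches.
    """
    labels = name.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blacklist for i in range(len(labels)))


def surrounding_queries(log: List[Query], hit: Query, window: float) -> List[str]:
    """Names queried by the same client within +/- window seconds of the hit."""
    return [
        q.name
        for q in log
        if q is not hit
        and q.client == hit.client
        and abs(q.timestamp - hit.timestamp) <= window
    ]


def detect(log: List[Query], blacklist: Set[str],
           window: float = 5.0) -> List[Tuple[Query, List[str]]]:
    """Pair every blacklisted query with the queries that surround it."""
    return [
        (q, surrounding_queries(log, q, window))
        for q in log
        if is_blacklisted(q.name, blacklist)
    ]
```

Calling detect(log, blacklist) on a list of Query records returns each blacklist hit together with the names the same machine queried shortly before and after it. That per-hit context, rather than the bare match, is the kind of evidence an administrator could inspect when estimating the cause of a detection.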

Objectives

Results

Conclusion