Malicious Domain Names Detection Algorithm Based on Lexical Analysis and Feature Quantification

Hong Zhao, Weijie Wang, Zhaobin Chang, Xiangyan Zeng

doi:10.1109/access.2019.2940554

Open Access

https://doi.org/10.1109/access.2019.2940554

Abstract

Malicious domain names are usually associated with a range of illegal activities and pose threats to people's privacy and property, so the problem of detecting them has attracted widespread concern. In this study, a malicious domain name detection algorithm based on lexical analysis and feature quantification is proposed. To achieve efficient and accurate detection, the method has two phases. The first phase checks an observed domain name against a blacklist of known malicious uniform resource locators (URLs). The observed domain name is classified as definitely malicious or potentially malicious based on its edit distances to the domain names on the blacklist. The second phase further evaluates a potentially malicious domain name by its reputation value, which represents its lexical features and is calculated with an N-gram model. The top 100,000 normal domain names in Alexa are used to build a whitelist substring set using the N-gram method, in which each domain name, excluding the top-level domain, is segmented into substrings of length 3, 4, 5, 6, and 7. The weighted values of the substrings are calculated according to their occurrence counts in the whitelist substring set. A potentially malicious domain name is segmented by the same N-gram method, and its reputation value is calculated from the weighted values of its substrings. Finally, the potentially malicious domain name is determined to be malicious or normal based on its reputation value. The effectiveness of the proposed detection method is demonstrated by experiments on publicly available data.
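
The second phase can be illustrated with a minimal Python sketch. The substring lengths 3 to 7 and the use of occurrence counts come from the abstract; everything else is an assumption made for illustration. In particular, the helper names (`strip_tld`, `ngrams`, `build_whitelist_counts`, `reputation`), the naive top-level-domain removal, and the log-scaled averaging used as the reputation value are hypothetical and are not the paper's actual weighting formula.

```python
from collections import Counter
import math

NGRAM_LENGTHS = (3, 4, 5, 6, 7)  # substring lengths named in the abstract


def strip_tld(domain: str) -> str:
    """Drop the last label. Assumption: naive TLD removal that ignores
    multi-label suffixes such as 'co.uk'."""
    labels = domain.lower().strip(".").split(".")
    return ".".join(labels[:-1]) if len(labels) > 1 else labels[0]


def ngrams(name: str, lengths=NGRAM_LENGTHS):
    """Yield every contiguous substring of each requested length."""
    for n in lengths:
        for i in range(len(name) - n + 1):
            yield name[i:i + n]


def build_whitelist_counts(domains):
    """Count how often each substring occurs across the whitelist domains."""
    counts = Counter()
    for domain in domains:
        counts.update(ngrams(strip_tld(domain)))
    return counts


def reputation(domain: str, counts: Counter) -> float:
    """Hypothetical reputation value: mean of log(1 + count) over the
    domain's substrings. Not the weighting used in the paper."""
    grams = list(ngrams(strip_tld(domain)))
    if not grams:
        return 0.0
    return sum(math.log1p(counts[g]) for g in grams) / len(grams)
```

Under this sketch, a domain whose substrings appear often in the whitelist scores higher than one made of rare character sequences. A decision threshold on the score would still have to be chosen empirically; the abstract does not state one.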

Highlights

Malicious domain names are widely used by attackers for illegal activities in the Domain Name System (DNS)
Fig. 1 presents the architecture of the malicious domain name detection algorithm based on lexical analysis and feature quantification, which consists of two components: construction of the domain name whitelist substring set and detection of malicious domain names
The observed domain name is identified as malicious if its edit distance to the domain names on the blacklist is less than a threshold value; otherwise, it is considered potentially malicious
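
The blacklist check in the last highlight can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the edit distance is taken to be the standard Levenshtein distance, the function names `edit_distance` and `first_phase` are hypothetical, and the default threshold of 2 is a placeholder because the page does not give the value used in the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def first_phase(domain: str, blacklist, threshold: int = 2) -> str:
    """Label a domain 'malicious' if its smallest edit distance to any
    blacklist entry is strictly less than the threshold, otherwise
    'potentially malicious'. The threshold value is an assumption."""
    best = min((edit_distance(domain, b) for b in blacklist), default=None)
    if best is not None and best < threshold:
        return "malicious"
    return "potentially malicious"
```

For example, with a blacklist containing `paypal.com`, the look-alike `paypa1.com` is at distance 1 and would be labeled malicious under the placeholder threshold, while an unrelated name would be passed on to the second phase.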

Summary

Introduction

Malicious domain names are widely used by attackers for illegal activities in the Domain Name System (DNS). As shown in some reports [1], [2], the number of malicious domain names has grown to the point where they cannot be ignored. The detection of malicious domain names plays a major role in ensuring network security. DNS, a core component of the Internet that provides flexible decoupling of a service's domain name and the hosting IP addresses, has been widely used in network communications, e-business, and mass media [3]. Almost all Internet applications need to use DNS to resolve domain names and achieve

Journal: IEEE Access
Publication Date: Jan 1, 2019
Citations: 16
License type: CC BY 4.0

Similar Papers

Malicious Domain Names Detection Algorithm Based on N-Gram
Hong Zhao ... Xiangyan Zeng
Journal of Computer Networks and Communications | VOL. 2019
03 Feb 2019

Detecting Multielement Algorithmically Generated Domain Names Based on Adaptive Embedding Model
Luhui Yang ... Jiangtao Zhai
Security and Communication Networks | VOL. 2021
31 May 2021

Detection of Malicious URLs using Machine Learning based on Lexical Features
Prabodha Y Abeynayake ... Udaya Wijenayake
Proceedings of Conference on Transdisciplinary Research in Engineering | VOL. 1
02 May 2024

You Look Suspicious!!: Leveraging Visible Attributes to Classify Malicious Short URLs on Twitter
Raj Kumar Nepali ... Yong Wang
01 Jan 2015
