CNN Based Malicious Website Detection by Invalidating Multiple Web Spams

Dongjie Liu, Jong-Hyouk Lee

doi:10.1109/access.2020.2995157

Abstract

Although a variety of techniques for detecting malicious websites have been proposed, it is becoming increasingly difficult for these methods to provide satisfactory results. Many malicious websites can still escape detection by using various Web spam techniques. In this paper, we first summarize three types of Web spam techniques used by malicious websites: redirection spam, hidden IFrame spam, and content hiding spam. We then present a new detection method that adopts the perspective of users and takes screenshots of malicious webpages to invalidate Web spam. The proposed detection method uses a Convolutional Neural Network (CNN), a class of deep neural networks, as its classification algorithm. To verify the effectiveness of the method, we conducted two experiments. First, the proposed method was tested on a constructed complex dataset, and we present comparison results between the proposed method and representative machine learning-based detection algorithms. Second, the proposed method was used to detect malicious websites in a real-world Web environment for three months. The experimental results show that the proposed method achieves better performance and is applicable to a practical Web environment.
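To make the hidden IFrame spam category concrete, the following is a minimal sketch of how such frames might be spotted in page markup. It is not the detection method of the paper, which works on screenshots and a CNN. The class name `HiddenIFrameFinder`, the function `find_hidden_iframes`, and the hiding heuristics (zero width or height, `display:none`, `visibility:hidden`) are illustrative assumptions.

```python
from html.parser import HTMLParser


class HiddenIFrameFinder(HTMLParser):
    """Collect the src of <iframe> tags that a user would likely not see."""

    def __init__(self):
        super().__init__()
        self.hidden_sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        # Attribute values may be None for valueless attributes.
        attr = {name: (value or "") for name, value in attrs}
        style = attr.get("style", "").replace(" ", "").lower()
        # Assumed heuristics: a zero-sized frame, or CSS that hides it.
        zero_sized = (attr.get("width") in ("0", "0px")
                      or attr.get("height") in ("0", "0px"))
        css_hidden = "display:none" in style or "visibility:hidden" in style
        if zero_sized or css_hidden:
            self.hidden_sources.append(attr.get("src", ""))


def find_hidden_iframes(html):
    """Return the src values of iframes matching the hiding heuristics."""
    finder = HiddenIFrameFinder()
    finder.feed(html)
    return finder.hidden_sources
```

A markup check of this kind only covers the hiding tricks it was written for, which is one reason a screenshot-based view of the page, as the abstract describes, can be attractive.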

Highlights

The Internet has become an indispensable part of people's lives.
EVALUATION: To verify the effectiveness of the method, we conducted two kinds of experiments: one on a constructed complex dataset and the other in a real-world Web environment.
1) CONSTRUCTED COMPLEX DATASET: To test whether the proposed method is effective and practical, a complex dataset is constructed in this paper, and its complexity is demonstrated as follows:

Summary

Introduction

The Internet has become an indispensable part of people's lives. While the Internet brings prosperity, it also causes problems such as illegal websites, fake medical websites, and pornographic and gambling sites. Although various detection techniques have been applied, the number of malicious websites continues to grow. The large amount of malicious information on the Internet is harmful to the health of Internet users, especially kids and teens [1], [2]. Researchers have proposed many detection methods, including heuristic methods and machine learning-based methods. Machine learning methods are now commonly used to analyze text and image information from websites; however, driven by the lure of profit, malicious websites use a variety of Internet spam techniques to evade regulation.

Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 53	License type: CC BY 4.0
