Malicious and benign websites classification using machine learning methods

M Lavreniuk, O Novikov

doi:10.20535/tacs.2664-29132020.1.209434

Abstract

Web surfing is now an integral part of everyday life, and users want to protect their data from thieves and malicious web pages. This paper therefore proposes a solution to the problem of discriminating between malicious and benign websites with the desired accuracy. We propose using machine learning methods to classify websites as malicious or benign based on the URL and other host-based features. State-of-the-art gradient-boosted decision trees are proposed for this task and compared with well-known machine learning methods such as random forest and multilayer perceptron. All of the machine learning methods provided the desired accuracy, higher than 95%, on this problem, and the proposed gradient-boosted decision trees outperformed the random forest and neural network approaches in terms of both overall accuracy and F1-score.
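
The two metrics named in the abstract, overall accuracy and F1-score, can be illustrated with a minimal sketch. The labels below are made-up toy data, not results from the paper, and treating the malicious class as the positive class (label 1) is an assumption made only for this example.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int],
             positive: int = 1) -> float:
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels (assumption: 1 = malicious, 0 = benign).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

print("accuracy:", overall_accuracy(y_true, y_pred))
print("F1-score:", f1_score(y_true, y_pred))
```

On this toy data the accuracy is 0.8 while the F1-score is about 0.67, which shows why the two metrics can rank classifiers differently.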

Highlights

… machine learning techniques for malicious Uniform Resource Locator (URL) detection [5], [6]
In 2010, the population of Internet users was about two billion [1], and by the end of June 2019 it had reached more than 4.5 billion [2]
In this paper, we propose machine learning methods for classifying websites as malicious or benign based on the URL itself and utilizing additional …
WHOIS_STATEPRO: a categorical variable; its values are the states we got from the server

Summary

Introduction

The popularity of the Internet grows every year. In 2010, the population of Internet users was about two billion [1], and by the end of June 2019 it had reached more than 4.5 billion [2]. These standard approaches have issues when new attacks are observed due to … machine learning techniques for malicious Uniform Resource Locator (URL) detection [5], [6]. For example, in [6], [7] the authors used only URL information for feature extraction by machine learning approaches. In this paper, we propose machine learning methods for classifying websites as malicious or benign based on the URL itself and utilizing additional … its values are the countries we got from the server response.
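
To make the idea of extracting features from URL information concrete, here is a minimal sketch using only the Python standard library. The function name `url_lexical_features` and the feature names are hypothetical illustrations; they are not the feature set used in the paper, and host-based features such as WHOIS fields are not covered.

```python
import re
from urllib.parse import urlparse


def url_lexical_features(url: str) -> dict:
    """Compute a few simple lexical features from a URL string.

    All feature names are illustrative assumptions, not the paper's features.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Total number of characters in the URL.
        "url_length": len(url),
        # Characters that are neither letters nor digits.
        "num_special_chars": sum(not c.isalnum() for c in url),
        # Digit characters anywhere in the URL.
        "num_digits": sum(c.isdigit() for c in url),
        # Dots in the host name, a rough proxy for subdomain depth.
        "num_host_dots": host.count("."),
        # Whether the scheme is HTTPS.
        "uses_https": parsed.scheme == "https",
        # Whether the host looks like a dotted-quad IPv4 address.
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
    }


# Made-up example URL, used only to show the output shape.
features = url_lexical_features("http://192.168.0.1/login.php?id=7")
for name, value in features.items():
    print(f"{name}: {value}")
```

A feature dictionary like this could be turned into a numeric vector and passed to any of the classifiers discussed here.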

Journal: Theoretical and Applied Cybersecurity
Publication Date: Aug 6, 2020
License type: cc-by
