Enhancing Detection of Malicious URLs Using Boosting and Lexical Features

Mohammad Atrees, Ashraf Ahmad, Firas Alghanim

doi:10.32604/iasc.2022.020229

Open Access

https://doi.org/10.32604/iasc.2022.020229

Abstract

A malicious URL is a link created to spread spam, phishing, malware, ransomware, spyware, and similar threats. By clicking on an infected URL, a user may download malware that adversely affects the computer, or may be persuaded to provide confidential information to a fraudulent website, causing serious losses. These threats must be identified and handled promptly and effectively. Detection is traditionally done with blacklists, which rely on keyword matching against previously known malicious domain names stored in a repository. This method is fast and easy to implement, and it has a low false-positive rate for previously recognized malicious URLs. However, it cannot recognize newly created malicious URLs. To solve this problem, many machine-learning models have been used. In this paper, we introduce an effective machine learning approach that uses an ensemble learning algorithm called AdaBoost (Adaptive Boosting), combined with different algorithms, to enhance detection. To filter the datasets, we used the CfsSubsetEval technique, an algorithm that searches for a subset of features that work well together. The datasets were collected from the UNB repository, divided into four categories (spam, phishing, malware, and defacement URLs), and combined with benign URLs; the dataset content is based on lexical features. The experimental results indicate that the proposed approach enhanced the detection accuracy of malicious URLs and lowered false-positive rates for all experimental algorithms.
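To make the idea of lexical features concrete, the following is a minimal sketch of extracting a few features from the URL string alone, without fetching the page. The function name `lexical_features` and the specific features are illustrative assumptions; they are not the feature set of the UNB datasets or of the paper.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features from the URL text only.

    The feature names are hypothetical examples, not the paper's feature set.
    """
    # urlparse only fills in the host when a scheme is present.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    path = parsed.path or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(path),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_special_chars": sum(ch in "-_@?=&%" for ch in url),
        # Rough check for a dotted-quad IPv4 address used as the host.
        "has_ip_host": host.replace(".", "").isdigit() and host.count(".") == 3,
        "num_path_segments": len([seg for seg in path.split("/") if seg]),
    }


features = lexical_features("http://192.168.0.1/login/update?id=42")
print(features)
```

Because such features depend only on the string, they can be computed for URLs that have never been seen before, which is the case a blacklist cannot handle.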

Highlights

We introduce an effective machine learning approach that uses an ensemble learning algorithm called AdaBoost (Adaptive Boosting), combined with different algorithms, to enhance detection.
The results summarized in Tab. 6 show the precision, recall, and accuracy of the Support Vector Machines (SVM) classifier on all datasets, before and after applying AdaBoost.
In this paper, we have discussed the challenges of detecting malicious URL content using traditional methods, and how machine learning helps to address these challenges by providing effective models that capture a larger distribution of malicious URLs.
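As a rough illustration of the ideas in these highlights, the sketch below implements the standard AdaBoost weight-update loop and computes precision, recall, and accuracy. It is not the authors' implementation: the base learner here is a one-feature decision stump (an assumption; the paper pairs AdaBoost with other classifiers such as SVM), and the toy data (URL length, digit count) is made up purely for demonstration.

```python
import math


def train_stump(X, y, w):
    """Pick the (feature, threshold, sign) rule with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign


def adaboost_fit(X, y, rounds=5):
    """Labels are +1 (malicious) or -1 (benign)."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this weak learner
        model.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1


def metrics(y_true, y_pred):
    """Precision, recall, and accuracy with +1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == -1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == -1 for t, p in zip(y_true, y_pred))
    tn = sum(t == -1 and p == -1 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


# Made-up toy data: [url_length, num_digits]; not taken from the paper.
X = [[20, 0], [25, 1], [30, 0], [80, 6], [95, 9], [120, 12]]
y = [-1, -1, -1, 1, 1, 1]

model = adaboost_fit(X, y)
train_preds = [adaboost_predict(model, row) for row in X]
print(metrics(y, train_preds))
```

Precision penalizes false positives (benign URLs flagged as malicious), while recall penalizes missed malicious URLs, which is why the two are reported alongside accuracy.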

Summary

Introduction

A user could be manipulated into voluntarily providing sensitive information to a phishing webpage, or become a victim of a drive-by download, ending with a malware infection [3,4]. Various types of malicious URLs exist; the most popular are phishing, spam, malware (drive-by download), and defacement URLs. Phishing websites are sites that seek to steal users' private and sensitive information, such as bank card numbers or user credentials. This is usually done by deceiving users into thinking they are on a legitimate website.

Journal: Intelligent Automation & Soft Computing
Publication Date: Jan 1, 2022
Citations: 3
License type: cc-by

Similar Papers

Feature-based Malicious URL and Attack Type Detection Using Multi-class Classification
20 Mar 2018

A Hybrid approach combining blocklists, machine learning and deep learning for detection of malicious URLs
Bronjon Gogoi ... Arabinda Dutta
15 Jul 2022

Malicious and Benign URL Dataset Generation Using Character-Level LSTM Models
Spencer Vecile ... Katarina Grolinger
22 Jun 2022

On Phishing: URL Lexical and Network Traffic Features Analysis and Knowledge Extraction using Machine Learning Algorithms (A Comparison Study)
Wesam Fadheel ... Wassnaa Al-Mawee
22 Jul 2022
