An Effective Phishing Detection Model Based on Character Level Convolutional Neural Network from URL

Ali Aljofey, Qingshan Jiang, Mingqing Huang, Qiang Qu, Jean-Pierre Niyigena

doi:10.3390/electronics9091514

Abstract

Phishing is among the easiest forms of cybercrime to carry out; it aims to entice people into giving up accurate information such as account IDs, bank details, and passwords. This type of cyberattack is usually triggered by emails, instant messages, or phone calls. Existing anti-phishing techniques are mainly based on source code features, which require scraping the content of web pages, and on third-party services, which slow down the classification of phishing URLs. Although machine learning techniques have lately been used to detect phishing, they depend on manual feature engineering and are not adept at detecting emerging phishing offenses. Owing to the recent rapid development of deep learning techniques, many deep learning-based methods have also been introduced to enhance classification performance. In this paper, a fast deep learning-based model is proposed that uses a character-level convolutional neural network (CNN) to detect phishing from the URL of a website. The proposed model does not require retrieving the content of the target website or using any third-party services. It captures information and sequential patterns from URL strings without requiring prior knowledge about phishing, and then uses these sequential pattern features for fast classification of the actual URL. For evaluation, traditional machine learning models and deep learning models are compared using various feature sets: hand-crafted features, character embeddings, character-level TF-IDF, and character-level count vectors. In the experiments, the proposed model achieved an accuracy of 95.02% on our dataset and accuracies of 98.58%, 95.46%, and 95.22% on benchmark datasets, outperforming existing phishing URL models.
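To make the idea of character-level input concrete, the following is a minimal sketch of how a URL string might be turned into a fixed-length sequence of integer indices before being passed to an embedding layer and convolutional filters. The alphabet, the padding and unknown-character indices, the maximum length of 200, and the name `encode_url` are all illustrative assumptions; the abstract does not specify the encoding the authors used.

```python
import string
from typing import List

# Assumed alphabet: ASCII letters, digits, and punctuation common in URLs.
# This is an illustrative choice, not the alphabet used in the paper.
ALPHABET = string.ascii_letters + string.digits + "-._~:/?#[]@!$&'()*+,;=%"

PAD = 0  # index reserved for padding short URLs
UNK = 1  # index reserved for characters outside the alphabet
CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(ALPHABET)}
VOCAB_SIZE = len(ALPHABET) + 2

MAX_LEN = 200  # assumed fixed input length (hypothetical value)


def encode_url(url: str, max_len: int = MAX_LEN) -> List[int]:
    """Map a URL to a fixed-length list of character indices.

    Longer URLs are truncated and shorter ones are right-padded with PAD.
    """
    indices = [CHAR_TO_INDEX.get(ch, UNK) for ch in url[:max_len]]
    return indices + [PAD] * (max_len - len(indices))


if __name__ == "__main__":
    encoded = encode_url("http://example.com/login", max_len=32)
    print(len(encoded), encoded)
```

Because the encoding works directly on the URL string, it needs neither the page content nor a third-party lookup, which matches the constraint described in the abstract. A model built on it would typically learn an embedding for each of the `VOCAB_SIZE` indices.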

Highlights

Phishing is an offense in which the phisher seeks to trick users into disclosing critical and personal information such as credit card details and passwords
It is observed that the Multinomial Naïve Bayes (MNB), Logistic Regression (LR), and Gaussian Naïve Bayes (GNB) classifiers have good accuracy, precision, recall, F-Score, and AUC with FG1 and FG3, whereas the MNB and GNB classifiers have low accuracy, precision, F-Score, and AUC with FG2, owing to the independent predictors and normal feature distribution assumed by the Naïve Bayes classifier
It has been observed that FG1 is superior to the other features for the MNB, LR, GNB, random forest (RF), and deep neural network (DNN) classifiers, whereas FG2 is superior to the other features for the XGB and convolutional neural network (CNN) classifiers

Summary

Introduction

Phishing is an offense in which the phisher seeks to trick users into disclosing critical and personal information such as credit card details and passwords. Phishers carry out these attacks to sell the victims' identities, to extort ransom, to exploit weaknesses in a system, or to gain financial profit [1]. One common form of the offense is to design deceptive sites that imitate benign websites (e.g., PayPal, eBay) and to host them on a hacked domain. Phishers can use malware (i.e., malicious software), web pages, and emails to carry out phishing offenses.



Journal: Electronics	Publication Date: Sep 15, 2020
Citations: 78	License type: CC BY 4.0


