Phishing Websites Detection using Machine Learning

Arun Kulkarni, Leonard L

doi:10.14569/ijacsa.2019.0100702

Abstract

Organizations spend tremendous resources guarding against and recovering from cybersecurity attacks by online hackers who gain access to sensitive and valuable user data. Many cyber infiltrations are accomplished through phishing attacks, in which users are tricked into interacting with web pages designed to look legitimate. Because humans are so susceptible to being tricked, automated methods of differentiating between phishing websites and their authentic counterparts are needed as an extra line of defense. The aim of this research is to develop such methods using various approaches to categorize websites. Specifically, we have developed a system that uses machine learning techniques to classify websites based on their URL. We used four classifiers: the decision tree, the Naïve Bayesian classifier, the support vector machine (SVM), and the neural network. The classifiers were tested with a data set containing 1,353 real-world URLs, each of which could be categorized as a legitimate site, suspicious site, or phishing site. The results of the experiments show that the classifiers were successful in distinguishing real websites from fake ones over 90% of the time.
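To make the classification step concrete, the following is a minimal sketch of one of the four classifier families named in the abstract, a Naïve Bayesian classifier over three classes. It is not the authors' implementation: the class name `CategoricalNaiveBayes`, the three-valued feature encoding (-1, 0, 1), and the nine training rows are all hypothetical and invented purely for illustration. They are not the 1,353-URL data set.

```python
import math
from collections import Counter


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with Laplace smoothing (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; 1.0 is add-one smoothing

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.class_counts = Counter(y)
        self.n_samples = len(y)
        n_features = len(X[0])
        # Distinct values observed for each feature column.
        self.values = [sorted({row[j] for row in X}) for j in range(n_features)]
        # counts[label][j][v] = number of rows of class `label` with feature j == v
        self.counts = {c: [Counter() for _ in range(n_features)] for c in self.classes}
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.counts[label][j][v] += 1
        return self

    def log_posterior(self, row):
        """Return unnormalized log P(class | row) for every class."""
        scores = {}
        for c in self.classes:
            score = math.log(self.class_counts[c] / self.n_samples)
            for j, v in enumerate(row):
                num = self.counts[c][j][v] + self.alpha
                den = self.class_counts[c] + self.alpha * len(self.values[j])
                score += math.log(num / den)
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_posterior(row)
        return max(scores, key=scores.get)


# Hypothetical toy data: each row is three URL-derived features encoded as
# -1 (phishing-like), 0 (ambiguous), or 1 (legitimate-like). Invented values.
X_train = [
    [1, 1, 1], [1, 1, 0], [1, 0, 1],
    [0, 0, 0], [0, 1, 0], [0, 0, -1],
    [-1, -1, -1], [-1, -1, 0], [-1, 0, -1],
]
y_train = ["legitimate"] * 3 + ["suspicious"] * 3 + ["phishing"] * 3

model = CategoricalNaiveBayes().fit(X_train, y_train)
print(model.predict([1, 1, 1]))
print(model.predict([-1, -1, -1]))
```

With real data, the same `fit` and `predict` calls would be applied to a training split and a held-out test split, and the accuracy on the held-out rows is the kind of figure the abstract reports.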

Highlights

While cybersecurity attacks continue to escalate in both scale and sophistication, social engineering approaches are still some of the simplest and most effective ways to gain access to sensitive or confidential information.
One general approach to recognizing illegitimate phishing websites relies on their Uniform Resource Locators (URLs).
A URL is a global address of a document on the World Wide Web, and it serves as the primary means to locate a document on the Internet.

Summary

Introduction

While cybersecurity attacks continue to escalate in both scale and sophistication, social engineering approaches are still some of the simplest and most effective ways to gain access to sensitive or confidential information. Organizations should educate employees about how to recognize phishing e-mails and links to help protect against such attacks. However, software such as HTTrack is readily available for duplicating entire websites. This problem implies that computer-based solutions for guarding against phishing attacks are needed alongside user education. Such a solution would enable a computer to identify malicious websites and prevent users from interacting with them. One general approach to recognizing illegitimate phishing websites relies on their Uniform Resource Locators (URLs). Even in cases where the content of a website is duplicated, the URL could still be used to distinguish the real site from an imposter.
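As an illustration of how a URL alone can yield signals for a classifier, the sketch below extracts a few lexical features using only the Python standard library. The function name `url_features` and the specific features chosen (length, HTTPS use, IP-address host, `@` symbol, dot count, hyphen in host) are assumptions made for this example; the text above does not list the features the authors actually used.

```python
import ipaddress
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few illustrative lexical features from a URL string.

    The feature set is hypothetical and chosen only for demonstration.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a common warning sign.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "length": len(url),
        "uses_https": parsed.scheme == "https",
        "host_is_ip": host_is_ip,
        # Text before "@" in the authority part is treated as user info,
        # so the real host is whatever follows it.
        "has_at_symbol": "@" in url,
        "num_host_dots": host.count("."),
        "has_hyphen_in_host": "-" in host,
    }


for example in (
    "https://www.example.com/login",
    "http://192.0.2.10/paypal/login",
    "http://paypal.com@evil-site.example/login",
):
    print(example, url_features(example))
```

A copied page served from the third example would look identical to the original in a browser, yet its host resolves to `evil-site.example`, which is the kind of difference a URL-based classifier can exploit.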

Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2019
Citations: 35
License type: cc-by
