Phishing Detection with Machine Learning

Pranav Habib, Karman Singh Sethi, Uday Sharma

doi:10.22214/ijraset.2022.48276

Abstract

The goal of our project is to implement a machine learning solution to the problem of detecting phishing and malicious web links. The end result will be a software product that uses a machine learning algorithm to detect malicious URLs. Phishing is the technique of extracting credentials and sensitive data from users by masquerading as a genuine website. In a phishing attack, the user is presented with a mirror website that looks identical to the legitimate one but contains malicious code that captures user credentials and sends them to the phishers. Phishing attacks can lead to huge financial losses for customers of banking and financial services. The traditional approach to phishing detection has been either to use a blacklist of known phishing links or to heuristically evaluate the attributes of a suspected phishing page to detect the presence of malicious code. The heuristic function relies on trial and error to define the threshold that separates malicious links from benign ones. The drawbacks of this approach are poor accuracy and low adaptability to new phishing links. We plan to use machine learning to overcome these drawbacks by implementing several classification algorithms and comparing their performance on our dataset. We will test algorithms such as Logistic Regression, SVM, Decision Trees and Neural Networks on a dataset of phishing links from the UCI Machine Learning Repository and use the best-performing model to develop a browser plugin, which can be published as a browser extension.
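To make the traditional heuristic approach concrete, the following is a minimal Python sketch of a threshold-based URL check. It is an illustration under stated assumptions, not the authors' method: the function names `heuristic_score` and `is_phishing`, the chosen URL attributes, their weights, and the `THRESHOLD` value are all hypothetical.

```python
import re
from urllib.parse import urlparse

# Hand-tuned cut-off (assumed value); in the heuristic approach this is
# found by trial and error rather than learned from data.
THRESHOLD = 3


def heuristic_score(url: str) -> int:
    """Sum hand-picked penalty weights for suspicious URL attributes.

    The attributes and weights are illustrative assumptions only.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    score = 0
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host):
        score += 2  # raw IP address used instead of a domain name
    if "@" in url:
        score += 2  # '@' can disguise the real destination
    if len(url) > 75:
        score += 1  # unusually long URL
    if host.count(".") > 3:
        score += 1  # many nested subdomains
    if parsed.scheme != "https":
        score += 1  # no TLS
    return score


def is_phishing(url: str) -> bool:
    """Flag the URL when its score reaches the fixed threshold."""
    return heuristic_score(url) >= THRESHOLD


if __name__ == "__main__":
    for candidate in (
        "http://192.168.0.1/login@bank",
        "https://www.example.com/",
        "http://secure-login.example.com/",
    ):
        print(candidate, heuristic_score(candidate), is_phishing(candidate))
```

A URL that avoids the listed attributes passes regardless of its intent, and every weight and the threshold must be retuned by hand when attackers change tactics. A trained classifier instead fits these parameters from labelled examples.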

Full Text