Phishing attack detection using gradient boosting

Sushmitha R Aslin

doi:10.26634/jdf.2.1.20840

Abstract

Phishing is a prevalent cyberattack in which deceptive websites mimic legitimate ones to trick individuals into revealing personal information such as usernames, passwords, and financial details. Attackers favor phishing because authentic-looking but malicious links deceive victims effectively and can bypass security measures. Detecting phishing is therefore crucial, and machine learning algorithms are effective tools for the task. Detection is complicated by the fact that attackers can manipulate features such as HTML, the DOM, and URLs using web scraping and scripting languages. This work proposes an approach in which machine learning classifiers address these threats by analyzing URLs and domain names. A dataset sourced from globally recognized intelligence services and organizations supports streamlined feature extraction, and prioritizing URL and domain name traits reduces processing overhead. The Gradient Boosting Classifier is applied to a dataset of 11,055 instances with thirty-two features to classify phishing URLs, and it achieves higher accuracy than methods such as Random Forest. Gradient boosting is effective across a wide range of machine learning tasks because it aggregates weak learners, such as decision trees, into a strong predictor. Its suitability for imbalanced datasets makes it particularly effective for phishing detection, where legitimate and malicious URLs must be distinguished. The method improves accuracy by extracting and comparing the distinct characteristics of legitimate and phishing URLs. By focusing on URL and domain name attributes, the work proposes a more effective approach to identifying phishing attempts.
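The idea of aggregating weak learners over URL and domain name features can be illustrated with a minimal sketch. It is not the paper's implementation: the five features in `url_features`, the six toy URLs, and all function names are assumptions made for illustration, since the abstract does not list the thirty-two features or describe the training setup. The weak learners here are one-split decision stumps fitted to the negative gradient of the log loss, written in plain Python so the mechanics stay visible; in practice a library classifier such as scikit-learn's `GradientBoostingClassifier` would be the more likely choice.

```python
import math
from urllib.parse import urlparse


def url_features(url):
    """Five hypothetical lexical features (NOT the paper's thirty-two)."""
    host = urlparse(url).hostname or ""
    return [
        float(len(url)),                                  # overall URL length
        float(host.count(".")),                           # dots in the host name
        1.0 if "@" in url else 0.0,                       # '@' appears in the URL
        1.0 if host.replace(".", "").isdigit() else 0.0,  # host is digits and dots only
        1.0 if url.startswith("https://") else 0.0,       # HTTPS scheme
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(y, probs):
    return -sum(yi * math.log(p) + (1 - yi) * math.log(1 - p)
                for yi, p in zip(y, probs)) / len(y)


def fit_stump(X, residuals):
    """Least-squares stump: one feature, one threshold, two leaf means."""
    mean = sum(residuals) / len(residuals)
    # Fallback: a constant stump (threshold +inf sends every row left).
    best = (sum((r - mean) ** 2 for r in residuals), 0, math.inf, mean, mean)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]


def stump_value(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def fit_gradient_boosting(X, y, n_rounds=20, learning_rate=0.5):
    """Boost stumps on the log loss; returns the stumps and the loss history."""
    scores = [0.0] * len(X)
    stumps, losses = [], []
    for _ in range(n_rounds):
        probs = [sigmoid(s) for s in scores]
        losses.append(log_loss(y, probs))
        # Negative gradient of the log loss with respect to the raw score.
        residuals = [yi - p for yi, p in zip(y, probs)]
        j, t, lm, rm = fit_stump(X, residuals)
        stump = (j, t, learning_rate * lm, learning_rate * rm)  # shrunken leaves
        stumps.append(stump)
        scores = [s + stump_value(stump, row) for s, row in zip(scores, X)]
    losses.append(log_loss(y, [sigmoid(s) for s in scores]))
    return stumps, losses


def predict_proba(stumps, row):
    """Estimated probability that the row describes a phishing URL."""
    return sigmoid(sum(stump_value(s, row) for s in stumps))


# Made-up toy data for illustration only: 1 = phishing, 0 = legitimate.
urls = [
    "http://192.168.10.5/secure/login",
    "http://paypal.com@evil.example/verify",
    "http://login.secure.account.update.example.net/session",
    "https://www.example.com/",
    "https://docs.python.org/3/library/",
    "https://example.org/about",
]
y = [1, 1, 1, 0, 0, 0]
X = [url_features(u) for u in urls]
stumps, losses = fit_gradient_boosting(X, y)
predictions = [1 if predict_proba(stumps, row) > 0.5 else 0 for row in X]
print(predictions)
print(losses[0], losses[-1])
```

Each round fits a stump to the residuals `y - p` and adds a shrunken copy of it to the running score, so the ensemble improves step by step even though every individual stump is a weak predictor. Six hand-picked URLs say nothing about real-world accuracy; the sketch only shows the mechanics that the abstract describes.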

Full Text