Understanding phishers' strategies of mimicking uniform resource locators to leverage phishing attacks: A machine learning approach

J Samantha Tharani, Nalin A.G Arachchilage

doi:10.1002/spy2.120

Abstract

Phishing is a type of social engineering attack intended to steal user data, including login credentials and credit card numbers, leading to financial losses for both organizations and individuals. It occurs when an attacker, pretending to be a trusted entity, lures a victim into clicking on a link or attachment in an email or a text message. Phishing is often launched via email messages or text messages over social networks. Previous research has revealed that phishing attacks can be identified just by looking at uniform resource locators (URLs). Identifying the techniques that phishers use to mimic URLs, however, remains a challenging issue. At present, we have limited knowledge and understanding of how cyber‐criminals attempt to mimic URLs with the same look and feel as the legitimate ones, to entice people into clicking links. Therefore, this paper investigates feature selection for phishing URLs, aiming to explore the strategies employed by phishers to mimic URLs that can trick people into clicking links. We applied information gain (IG) and Chi‐Squared feature selection methods from machine learning (ML) to a phishing dataset. The dataset contains a total of 48 features extracted from 5000 phishing and 5000 legitimate URLs from web pages downloaded from January to May 2015 and from May to June 2017. Our results revealed 10 techniques that phishers used to mimic URLs to manipulate humans into clicking links. Identifying these phishing URL manipulation techniques would help to educate individuals and organizations and keep them safe from phishing attacks. In addition, the findings of this research will help develop anti‐phishing tools, frameworks, or browser plugins for phishing prevention.
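To make the two feature selection methods named in the abstract concrete, the following is a minimal sketch of how information gain and a Pearson chi‐squared statistic can score a single categorical URL feature against a phishing/legitimate label. It is an illustration under stated assumptions, not the authors' pipeline: the feature names `has_at_symbol` and `uses_https`, the eight‐row toy data, and the helper function names are all hypothetical, and the exact chi‐squared variant used in the paper may differ from the plain contingency‐table statistic shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature's values."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def chi_squared(feature, labels):
    """Pearson chi-squared statistic of the feature/label contingency table."""
    total = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    observed = Counter(zip(feature, labels))
    stat = 0.0
    for f, f_count in feature_counts.items():
        for l, l_count in label_counts.items():
            expected = f_count * l_count / total
            stat += (observed[(f, l)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: 1 = phishing, 0 = legitimate.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "has_at_symbol": [1, 1, 1, 0, 0, 0, 0, 0],  # assumed, for illustration
    "uses_https":    [1, 0, 1, 0, 1, 0, 1, 0],  # assumed, for illustration
}

# Rank the features by information gain, highest first.
ranking = sorted(features,
                 key=lambda name: information_gain(features[name], labels),
                 reverse=True)

for name in ranking:
    ig = information_gain(features[name], labels)
    chi2 = chi_squared(features[name], labels)
    print(f"{name}: IG={ig:.4f}, chi2={chi2:.4f}")
```

In this toy setup, the feature that co‐occurs with the phishing label scores higher on both measures than the feature distributed evenly across the two classes, which is the basis on which such methods rank candidate URL features.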

Full Text