Abstract
As technology has developed, the Internet has become ubiquitous and accessible to everyone. There is a considerable number of web pages serving different purposes, but not all of them are legitimate: so-called phishing sites deceive users in order to serve their operators' interests. This paper addresses the problem using machine learning algorithms and a novel phishing-detection dataset containing 5000 legitimate web pages and 5000 phishing ones. To obtain the best results, various machine learning algorithms were tested, and J48, Random Forest, and Multilayer Perceptron were then chosen. Different feature selection tools were applied to the dataset to improve the efficiency of the models. The best result was achieved by using 20 of the 48 features with the Random Forest algorithm, which reached an accuracy of 98.11%.
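The feature selection step described in the abstract (keeping 20 of 48 features) can be illustrated with a minimal sketch. The abstract does not name the feature selection tools that were used, so ranking by information gain is an assumption here, and `make_synthetic_pages` generates hypothetical stand-in data rather than the paper's dataset. All function names are illustrative.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Entropy reduction in `labels` after splitting on a discrete feature column."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_top_k(rows, labels, k):
    """Return the indices of the k features with the highest information gain."""
    n_features = len(rows[0])
    gains = [information_gain([r[j] for r in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: gains[j], reverse=True)
    return ranked[:k]


def make_synthetic_pages(n_rows=1000, n_features=48, n_informative=20, seed=0):
    """Hypothetical stand-in data (NOT the paper's dataset).

    Each of the first `n_informative` binary features copies the label with
    probability 0.8 and is a coin flip otherwise; the rest are pure noise.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_rows):
        label = rng.randint(0, 1)  # 1 = phishing, 0 = legitimate (assumed encoding)
        row = []
        for j in range(n_features):
            if j < n_informative and rng.random() < 0.8:
                row.append(label)
            else:
                row.append(rng.randint(0, 1))
        rows.append(row)
        labels.append(label)
    return rows, labels


rows, labels = make_synthetic_pages()
selected = select_top_k(rows, labels, k=20)
# Keep only the selected columns before training a classifier.
reduced_rows = [[row[j] for j in selected] for row in rows]
```

The reduced rows would then be passed to a classifier such as Random Forest. Training on fewer, more informative columns is the efficiency gain the abstract refers to.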
Highlights
The Internet is everywhere today, and society uses web services for a range of activities such as sharing knowledge, social communication, and financial activities, including buying, selling, and transferring money
The aim of this paper is to present a study of existing machine learning methods for detecting phishing web pages, focusing on the most common feature selection methods used to address various problems and to improve performance and effectiveness on a phishing dataset
Nowadays there is an enormous number of web pages, and phishing web pages make up a significant part of them
Summary
The Internet is everywhere today, and society uses web services for a range of activities such as sharing knowledge, social communication, and financial activities, including buying, selling, and transferring money. Phishing is a conventional attack on the Internet, defined as the social engineering process of luring users to fraudulent websites to obtain personal or sensitive information such as user names, passwords, addresses, credit card details, and social security numbers. A recent Microsoft Security Intelligence Report (volume 24) [2] found that phishing attacks topped the list of web attacks discovered in 2018, and they are expected to keep increasing. The major challenge in detecting phishing attacks lies in discovering the techniques used. Phishers continuously refine their strategies and can create web pages that protect themselves against many forms of detection. Developing robust, effective, and up-to-date phishing detection methods is therefore necessary to counter the adaptive techniques employed by phishers [3].
iJIM, Vol. 13, No. 12, 2019