Improving the classification of phishing websites using a hybrid algorithm

Suvita Rani Sharma, Birmohan Singh, Manpreet Kaur

doi:10.1111/coin.12494

Abstract

This article proposes a hybrid algorithm for classifying websites as phishing or legitimate. The dataset may have an imbalanced class distribution and may contain irrelevant features. In data preprocessing, the adaptive synthetic sampling approach is therefore used to handle the class imbalance. Irrelevant or redundant features are then removed from the balanced data using the proposed binary versions of the Rao algorithms. S‐shaped and V‐shaped transfer functions map the continuous search space to a discrete search space, and the results obtained with each family of transfer functions are analyzed for the proposed algorithms. Performance is further improved by optimizing the value of the k parameter in the kNN classifier. The dataset is taken from the UCI machine‐learning repository. The proposed approach is evaluated using the polygon area metric, and the obtained classification accuracy is 97.044%. For validation, the proposed hybrid algorithm is compared with other state‐of‐the‐art techniques. For performance analysis, it is also compared with seven metaheuristic feature selection algorithms and six filter methods. Additionally, the proposed approach is applied to URLs registered on the PhishTank website.
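The mapping from a continuous search space to a binary feature mask can be illustrated with a minimal sketch. The abstract does not state which S‐shaped and V‐shaped functions the authors use, so the sigmoid and |tanh| forms below, along with the names `s_shaped`, `v_shaped`, `binarize_s`, and `binarize_v`, are assumptions chosen as commonly used representatives of each family, not the authors' implementation.

```python
import math
import random


def s_shaped(x: float) -> float:
    """Sigmoid transfer function (assumed form): maps a real value into [0, 1]."""
    # Numerically stable evaluation for large negative x.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x: float) -> float:
    """|tanh(x)| transfer function (assumed form): maps a real value into [0, 1]."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: set each bit to 1 with probability S(x), else 0."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule: flip the current bit with probability V(x), else keep it."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]


# Hypothetical continuous position of one candidate solution (one value per feature).
rng = random.Random(0)
position = [-2.0, -0.5, 0.0, 0.5, 2.0]
mask_s = binarize_s(position, rng)                    # 1 = feature selected
mask_v = binarize_v(position, [1, 0, 1, 0, 1], rng)   # starts from an existing mask
```

Under these assumptions, an S‐shaped function sets each bit directly from the continuous value, whereas a V‐shaped function decides whether to flip the bit already held, so a value near zero leaves the current feature mask unchanged.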

Journal: Computational Intelligence
Publication Date: Nov 30, 2021
Citations: 3

Similar Papers

Detection of Heart Murmurs for Imbalanced Dataset Using Adaptive Synthetic Sampling Approach
Madhusudhan Mishra ... Anirban Mukherjee
01 May 2019

Designing a Model to Handle Imbalance Data Classification Using SMOTE and Optimized Classifier
Shraddha Shivaji Nimankar ... Deepali Vora
19 Aug 2020

Feature selection algorithm for high dimensional biomedical data classification based on redundant removal
Bingtao Zhang ... Bin Hu
01 Jan 2018

A binary Krill Herd approach based feature selection for high dimensional data
V Preeja ... A H Shahana
01 Aug 2016
