HELPHED: Hybrid Ensemble Learning PHishing Email Detection

Panagiotis Bountakas, Christos Xenakis

doi:10.1016/j.jnca.2022.103545

Abstract

Phishing email attacks have been a dominant cyber-criminal strategy for decades. Despite their longevity, they evolved during the COVID-19 pandemic, indicating that adversaries exploit critical situations to lure victims. Many detectors have been proposed over the years, focusing mainly on either the content or the textual information of emails; however, to cope with the evolution of phishing emails, more sophisticated approaches are needed that exploit all of an email’s traits to enhance the detection capability of Machine Learning/Deep Learning classifiers. To tackle the limitations of existing works, this paper proposes a phishing email detection methodology, named HELPHED, that detects phishing emails by combining Ensemble Learning methods with hybrid features. The hybrid features provide an accurate representation of emails by fusing their content and textual traits. We propose two HELPHED methods: the first employs the Stacking Ensemble Learning method, while the second uses Soft Voting Ensemble Learning. Both methods deploy two different Machine Learning algorithms to handle the content-based and text-based features separately, yet in parallel, minimizing the features’ complexity and improving the model’s performance. A thorough evaluation is carried out following innovative guidelines that aim to prevent partial and misleading results. Experimental tests verified that combining hybrid features with Ensemble Learning achieves, overall, better detection performance than employing only content-based or only text-based features. Numerical results on a rich imbalanced dataset (i.e., 32,051 benign and 3,460 phishing email samples) that reflects the evolution of phishing emails show that Soft Voting Ensemble Learning outperforms other prominent Machine Learning/Deep Learning algorithms and existing works, yielding an F1-score of 0.9942.
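To make the soft voting idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes two already-trained base classifiers, one fed content-based features and one fed text-based features, each emitting class probabilities of the form [P(benign), P(phishing)]. The probability values, the function names `soft_vote` and `f1_score`, and the equal default weights are all illustrative assumptions. Soft voting averages the per-class probabilities and picks the most probable class; the F1-score is then computed for the phishing class, the minority class in an imbalanced dataset.

```python
from typing import List, Optional, Sequence


def soft_vote(
    prob_sets: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Weighted average of per-class probabilities across models, then argmax."""
    if weights is None:
        weights = [1.0] * len(prob_sets)  # assumption: equal weight per model
    total = sum(weights)
    predictions = []
    for i in range(len(prob_sets[0])):
        n_classes = len(prob_sets[0][i])
        avg = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / total
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive (phishing) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical outputs of the two base models: [P(benign), P(phishing)] per email.
content_probs = [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3], [0.2, 0.8]]
text_probs = [[0.8, 0.2], [0.3, 0.7], [0.2, 0.8], [0.9, 0.1]]
y_true = [0, 1, 1, 1]  # 0 = benign, 1 = phishing

y_pred = soft_vote([content_probs, text_probs])
print(y_pred, f1_score(y_true, y_pred))
```

In this toy data the third email is caught only because the text-based model is confident enough to outweigh the content-based model, which is the kind of complementarity hybrid features are meant to exploit. Stacking would instead train a meta-classifier on the base models' outputs rather than averaging them.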

Journal: Journal of Network and Computer Applications
Publication Date: Nov 21, 2022
Citations: 28

Similar Papers

A Systematic Literature Review on Phishing Email Detection Using Natural Language Processing Techniques
Said Salloum ... Sunil Vadera
IEEE Access | VOL. 10 | 01 Jan 2021

Phishing Email Detection Based on Hybrid Features
Zhuorao Yang ... Chen Qiao
IOP Conference Series: Earth and Environmental Science | VOL. 252 | 01 Apr 2019

Detecting Phishing Emails Using Hybrid Features
Liping Ma ... Paul Watters
01 Jan 2009

Phishing email detection technique by using hybrid features
Lew May Form ... San Nah Sze
01 Aug 2015
