Phishing Image Spam Classification Research Trends: Survey and Open Issues

Ovye John Abari, Fatimah Khalid, Nor Fazlida, Mohd Yunus, Noor Afiza

doi:10.14569/ijacsa.2020.0111196


Open Access

https://doi.org/10.14569/ijacsa.2020.0111196


Abstract

A phishing email is an attack that targets people directly in order to circumvent traditional security algorithms. To internet users, email appears to be a dependable, appropriate, and solid communication medium. At present, however, email is flooded with spam content, either as plain text or as unwanted text embedded in images. This study reviews articles on phishing image spam classification published from 2006 to 2020, organized by spam classification application domain, dataset, feature set, classification method, and the measurement metrics adopted in existing studies. More than 50 articles were selected from the Web of Science and Scopus databases. To meet the study's objective, we carried out a broad survey and analysis to identify the domains where spam classification has been applied. We also identified several public datasets, feature sets, classification methods, and measurement metrics, and pinpointed the most popular ones. The study revealed that the Personal Collection, Dredze, and Spam Archives datasets are the most commonly used datasets in image spam classification research. Low-level features and image metadata are the most widely used feature sets. The image spam classification methods identified in this study are supervised machine learning, unsupervised machine learning, semi-supervised machine learning, content-based learning, and statistical learning. Among these, the most commonly used method is the Support Vector Machine (SVM), which falls under supervised machine learning, followed by Naïve Bayes and K-Nearest Neighbor. The metrics commonly adopted for evaluating the performance of existing image spam classifiers are also identified and briefly discussed. We compare the performance of state-of-the-art image spam models. Lastly, we point out promising directions for future research.

Highlights

Phishing is a social engineering attack in which vulnerable people are manipulated into giving their confidential information to fraudsters, called phishers
This study provides a thorough overview of image spam classification studies to help researchers in this field gain a solid understanding of current image spam classification solutions in the major areas
The selected papers were analyzed along five dimensions: spam classification application domains, the datasets adopted and feature sets used in the two application domains, the methods used, and the metrics considered for performance evaluation

Summary

INTRODUCTION

Phishing is a social engineering attack in which vulnerable people are manipulated into giving their confidential information to fraudsters, called phishers. Different types of techniques are used to classify image spam, as shown in Fig. 4 [3]. These are grouped into Supervised Machine Learning, Unsupervised Machine Learning, Semi-supervised Machine Learning, Content-based Learning, and Statistical Learning. Numerous researchers have used these approaches for phishing email classification and detection. The objective of image spam is clearly to bypass the analysis of text-based email content performed by existing spam algorithms. For this reason, spammers usually add some bogus text to the email alongside the attached image, such as a string of words that are likely to appear in genuine emails and not in spam [10].
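To make the supervised category concrete, the following is a minimal sketch of K-Nearest Neighbor classification (one of the methods named in the abstract) over a single low-level feature, a grayscale intensity histogram. Everything specific here is an assumption for illustration and is not taken from the surveyed papers: the synthetic "images" (flat lists of pixel values), the function names `intensity_histogram`, `knn_predict`, and `synthetic_image`, the 8 histogram bins, and k = 3. Real studies would use actual image datasets and richer feature sets.

```python
import math
import random
from collections import Counter


def intensity_histogram(pixels, bins=8):
    """Low-level feature: normalized histogram of grayscale values in [0, 255]."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def synthetic_image(kind, rng, n_pixels=400):
    """Hypothetical toy data, not a real dataset.

    "spam": mostly near-white background with some dark text-like pixels.
    "ham":  pixel values spread across the whole range, like a photograph.
    """
    if kind == "spam":
        return [
            rng.randint(0, 30) if rng.random() < 0.25 else rng.randint(225, 255)
            for _ in range(n_pixels)
        ]
    return [rng.randint(0, 255) for _ in range(n_pixels)]


rng = random.Random(0)

# Build a small labeled training set of feature vectors.
train = [
    (intensity_histogram(synthetic_image(kind, rng)), kind)
    for kind in ("spam", "ham")
    for _ in range(20)
]

# Classify two unseen synthetic images.
spam_prediction = knn_predict(train, intensity_histogram(synthetic_image("spam", rng)))
ham_prediction = knn_predict(train, intensity_histogram(synthetic_image("ham", rng)))
print(spam_prediction, ham_prediction)
```

The toy classes are deliberately easy to separate, so the sketch only shows the shape of the pipeline (feature extraction, then a labeled nearest-neighbor vote). It says nothing about the accuracy such a feature would reach on real image spam.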

RELATED WORKS

Identification of Spam Classification Application Areas

Spam Classification Dataset Analysis and Review

Feature Set Analysis and Review

Spam Classification Techniques Analysis and Review

Results


Performance Metrics Review and Analysis


CONCLUSION


Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2020
Citations: 1
License type: cc-by
