Cost-Sensitive Spam Detection Using Parameters Optimization and Feature Selection

Sang Min Lee, Dong-Seong Kim, Jong Sou Park

doi:10.3217/jucs-017-06-0944

https://doi.org/10.3217/jucs-017-06-0944

Abstract

E-mail spam is no longer merely a nuisance but a security risk, since it now often carries virus attachments and spyware agents that can damage recipients' systems; there is therefore a growing need for spam detection. Many spam detection techniques based on machine learning have been proposed. Because bulk mailing tools have tremendously increased the volume of spam, detection techniques must be able to cope with that volume. To this end, parameter optimization and feature selection have been used to reduce processing overhead while guaranteeing high detection rates. However, previous approaches have not taken into account the variable importance of features or the optimal number of features. Moreover, to the best of our knowledge, no existing approach uses parameter optimization and feature selection together for spam detection. In this paper, we propose a spam detection model that enables both parameter optimization and optimal feature selection: we optimize two parameters of the detection model using Random Forests (RF) so as to maximize the detection rate. We provide the variable importance of each feature so that irrelevant features are easy to eliminate. Furthermore, we determine the optimal number of selected features using two methods: (i) a single parameter optimization for the whole feature selection process and (ii) parameter optimization in every feature elimination phase. Finally, we evaluate our spam detection model with cost-sensitive measures to avoid misclassifying legitimate messages, since the cost of classifying a legitimate message as spam far outweighs the cost of classifying spam as legitimate. We perform experiments on the Spambase dataset and show the feasibility of our approaches.
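The abstract does not say which cost-sensitive measures are used. As a minimal sketch, assuming the weighted accuracy and total cost ratio (TCR) that are common in the spam-filtering literature, the asymmetry between the two error types can be expressed with a weight that counts each misclassified legitimate message as several errors. The weight `lam`, the class `Confusion`, and the counts for `filter_a` and `filter_b` are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of classification outcomes on a test set."""
    legit_as_legit: int  # legitimate messages kept (correct)
    legit_as_spam: int   # legitimate messages blocked (the costly error)
    spam_as_spam: int    # spam blocked (correct)
    spam_as_legit: int   # spam let through (the cheaper error)


def weighted_accuracy(c: Confusion, lam: float) -> float:
    """Accuracy in which each legitimate message counts lam times."""
    n_legit = c.legit_as_legit + c.legit_as_spam
    n_spam = c.spam_as_spam + c.spam_as_legit
    return (lam * c.legit_as_legit + c.spam_as_spam) / (lam * n_legit + n_spam)


def total_cost_ratio(c: Confusion, lam: float) -> float:
    """Cost of using no filter divided by cost of using the filter.

    Values above 1 mean the filter is better than no filter at all.
    """
    n_spam = c.spam_as_spam + c.spam_as_legit
    cost = lam * c.legit_as_spam + c.spam_as_legit
    return float("inf") if cost == 0 else n_spam / cost


# Two hypothetical filters with the same number of errors (40 out of 1000),
# but with the error types swapped.
filter_a = Confusion(legit_as_legit=590, legit_as_spam=10,
                     spam_as_spam=370, spam_as_legit=30)
filter_b = Confusion(legit_as_legit=570, legit_as_spam=30,
                     spam_as_spam=390, spam_as_legit=10)

for name, c in (("A", filter_a), ("B", filter_b)):
    print(name,
          round(weighted_accuracy(c, 1), 4),
          round(weighted_accuracy(c, 9), 4),
          round(total_cost_ratio(c, 9), 4))
```

With `lam = 1` the two filters are indistinguishable, while with `lam = 9` the filter that blocks fewer legitimate messages scores higher on both measures. This is the behaviour a cost-sensitive evaluation is meant to capture.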

Journal: Journal of Universal Computer Science
Publication Date: Jan 1, 2011
Citations: 4
License type: cc-by
