Spam filtering for short messages in adversarial environment

Patrick P.K. Chan, Cheng Yang, Daniel S. Yeung, Wing W.Y. Ng

doi:10.1016/j.neucom.2014.12.034

Abstract

Unsolicited bulk messages are widespread in short message applications. Although existing spam filters perform satisfactorily, they face the challenge of an adversary who misleads them by manipulating samples. Until now, the vulnerability of spam filtering techniques for short messages has not been investigated. Unlike other spam applications, a short message contains only a few words and its length usually has an upper limit, so current adversarial learning algorithms may not work efficiently in short message spam filtering. In this paper, we investigate the existing good word attack and its counterattack method, i.e. feature reweighting, in short message spam filtering in an effort to understand whether, and to what extent, they work efficiently when the length of a message is limited. We propose a good word attack strategy that maximizes the influence on a classifier with the fewest inserted characters, based on both the weight values and the lengths of words. We also propose a feature reweighting method with a new rescaling function that minimizes the importance of features representing short words, so that a successful evasion requires more inserted characters. The methods are evaluated experimentally on SMS and comment spam datasets. The results confirm that word length is a critical factor in the robustness of short message spam filtering against the good word attack.
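The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract gives neither the exact selection rule nor the rescaling function, so everything below is an assumption made for illustration. We assume a linear classifier over distinct words in which a positive score means spam, a greedy attack that ranks good words by weight per inserted character (counting one separating space), and a hypothetical rescaling `len(w) / (len(w) + alpha)` that shrinks the weights of short words more than those of long words. The names `score`, `good_word_attack`, `length_rescale`, `max_len`, and `alpha` are ours.

```python
from typing import Dict, Tuple


def score(message: str, weights: Dict[str, float], bias: float = 0.0) -> float:
    """Linear score over the distinct words of a message.

    Assumed convention: score > 0 is classified as spam.
    """
    words = set(message.lower().split())
    return bias + sum(weights.get(w, 0.0) for w in words)


def good_word_attack(
    message: str,
    weights: Dict[str, float],
    max_len: int,
    bias: float = 0.0,
) -> Tuple[str, int]:
    """Greedily append good words until the message evades or runs out of room.

    Returns the modified message and the number of inserted characters.
    """
    present = set(message.lower().split())
    # Good words: negative weight (legitimate-indicating), not yet in the message.
    candidates = [w for w, wt in weights.items() if wt < 0 and w not in present]
    # Most negative weight per inserted character first (+1 for the space).
    candidates.sort(key=lambda w: weights[w] / (len(w) + 1))

    inserted = 0
    for w in candidates:
        if score(message, weights, bias) <= 0:
            break  # already classified as legitimate
        if len(message) + len(w) + 1 > max_len:
            continue  # this word would exceed the message length limit
        message = message + " " + w
        inserted += len(w) + 1
    return message, inserted


def length_rescale(weights: Dict[str, float], alpha: float = 3.0) -> Dict[str, float]:
    """Hypothetical rescaling: shrink weights of short words more than long ones."""
    return {w: wt * len(w) / (len(w) + alpha) for w, wt in weights.items()}
```

Under these assumptions, a tight `max_len` can leave the attacker unable to insert enough good words to push the score below zero, which is the situation the abstract identifies as specific to short messages.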

Journal: Neurocomputing
Publication Date: Dec 24, 2014
Citations: 61

Similar Papers

Spam filtering for short messages
Gordon V Cormack ... Enrique Puertas Sánz
06 Nov 2007

Countering Good Word Attacks on Statistical Spam Filters with Instance Differentiation and Multiple Instance Learning
Yan Zhou ... Meador Inge
01 Aug 2008

Short Messages Spam Filtering Using Sentiment Analysis
Enaitz Ezpeleta ... José María Gómez Hidalgo
01 Jan 2015

A Vector Space Model based spam SMS filter
Wei Li ... Sisheng Zeng
01 Aug 2016
