Text normalization and semantic indexing to enhance Instant Messaging and SMS spam filtering

Tiago A Almeida, Tiago P Silva, Igor Santos, José M Gómez Hidalgo

doi:10.1016/j.knosys.2016.05.001

Abstract

The rapid popularization of smartphones has contributed to the growth of online Instant Messaging and SMS usage as an alternative means of communication. The increasing number of users, along with the trust they inherently place in their devices, makes such messages a propitious environment for spammers. In fact, reports clearly indicate that the volume of spam over Instant Messaging and SMS is increasing dramatically year by year. This is a challenging problem for traditional filtering methods, since such messages are usually short and rife with slang, idioms, symbols and acronyms that make even tokenization difficult. In this scenario, this paper proposes and evaluates a method to normalize and expand original short and messy text messages in order to acquire better attributes and enhance classification performance. The proposed text processing approach is based on lexicographic and semantic dictionaries along with state-of-the-art techniques for semantic analysis and context detection. This technique is used to normalize terms and create new attributes that change and expand the original text samples, aiming to alleviate factors that can degrade the algorithms' performance, such as redundancies and inconsistencies. We evaluated our approach on a public, real and non-encoded dataset with several established machine learning methods. Our experiments were carefully designed to ensure statistically sound results, which indicate that the proposed text processing techniques can in fact enhance Instant Messaging and SMS spam filtering.
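As a rough illustration of the normalize-then-expand idea described in the abstract, a minimal Python sketch might look like the following. It is not the authors' implementation: the dictionaries `LINGO` and `CONCEPTS` and the function names are hypothetical stand-ins for the lexicographic and semantic dictionaries the paper refers to, and the context-detection step is omitted entirely.

```python
import re

# Hypothetical toy dictionaries, assumed for illustration only.
# They are NOT the lexicographic or semantic resources used in the paper.
LINGO = {"u": "you", "r": "are", "2": "to", "gr8": "great", "txt": "text"}
CONCEPTS = {"free": ["gratis", "complimentary"], "prize": ["award", "reward"]}


def tokenize(message):
    """Lowercase the message and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", message.lower())


def normalize(tokens):
    """Replace slang and abbreviations with a canonical form when known."""
    return [LINGO.get(token, token) for token in tokens]


def expand(tokens):
    """Append related terms after each token to create extra attributes."""
    expanded = []
    for token in tokens:
        expanded.append(token)
        expanded.extend(CONCEPTS.get(token, []))
    return expanded


def preprocess(message):
    """Tokenize, normalize, then expand a raw short message."""
    return expand(normalize(tokenize(message)))


if __name__ == "__main__":
    print(preprocess("U r gr8! Txt 2 win a FREE prize"))
```

A context-free lookup such as mapping `2` to `to` will be wrong in some messages; the abstract's mention of context detection suggests the real method addresses this kind of ambiguity, which this sketch does not attempt.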

Journal: Knowledge-Based Systems
Publication Date: May 14, 2016
Citations: 85

Similar Papers

Automated Spam Detection in Short Text Messages
Gaurav Goswami ... Richa Singh
02 Oct 2015

Usage patterns of communication interfaces for social support among at-risk adolescents
David Passig
Education and Information Technologies | VOL. 19
10 Mar 2013

Contextual correlation based thread detection in short text message streams
Jiuming Huang ... Quanyuan Wu
Journal of Intelligent Information Systems | VOL. 38
25 May 2011

The use of instant messaging in clinical data sharing: the EHRA SMS survey
F Guerra ... D Duncker
EP Europace | VOL. 23
24 May 2021
