R-SALSA: A spam filtering technique for social networking sites

Mohit Agrawal, R Leela Velusamy

doi:10.1109/sceecs.2016.7509326

Abstract

Nowadays, social media is instrumental for rapid communication among users across the globe. The growth of social media platforms such as LinkedIn, Google+, MySpace, Pinterest, Facebook, Instagram, Twitter, Yammer, Weibo, and Hyves has led to a rise in the volume of unsolicited messages and spamming activity over the past few years. This volume of spam severely disrupts essential communication. Spam messages may be generated by automated spam bots or by human users. In view of these circumstances, much research effort has gone into spam filtering based on supervised approaches. However, the static nature of a supervised approach requires model retraining to identify new varieties of spam messages. Motivated by this limitation, this paper proposes an unsupervised approach, the Reliability based Stochastic Approach for Link-Structure Analysis (R-SALSA) algorithm, for classifying a message as spam or benign. The proposed algorithm is tested on a dataset collected from Hyves, a popular social networking site in the Netherlands. It is evaluated with several performance metrics, namely true positive rate, false positive rate, and accuracy, and is found to perform better than the previously proposed unsupervised author-reporter model. The proposed algorithm improves spam identification accuracy by 9.17% compared with the Hyperlink-Induced Topic Search (HITS) based method and by 2.49% compared with the SALSA-based method.
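The three evaluation metrics named in the abstract have standard definitions, which the following minimal sketch illustrates. It is not the authors' evaluation code: the function names `confusion_counts` and `evaluate`, the label encoding (1 for spam, 0 for benign), and the toy labels are assumptions made here for illustration only.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN, treating label 1 as spam and 0 as benign (assumed encoding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam flagged as spam
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged as spam
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed as benign
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam passed as benign
    return tp, fp, tn, fn


def evaluate(y_true, y_pred):
    """Return true positive rate, false positive rate, and accuracy."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "tpr": tp / (tp + fn) if (tp + fn) else 0.0,       # TP / (TP + FN)
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,       # FP / (FP + TN)
        "accuracy": (tp + tn) / total if total else 0.0,   # (TP + TN) / all
    }


# Hypothetical toy labels, not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(evaluate(y_true, y_pred))
```

A good spam filter aims for a high true positive rate together with a low false positive rate; accuracy alone can be misleading when spam and benign messages are unevenly represented.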

Full Text