Ensemble based spam detection in social IoT using probabilistic data structures

Amritpal Singh, Shalini Batra

doi:10.1016/j.future.2017.09.072

Abstract

A social approach can be used for the Internet of Things (IoT) to connect a large number of objects in social networks such as Twitter, Facebook, and Instagram. Social networks within the IoT domain have simplified the task of dynamic discovery of services and information. Detecting spam in social media, especially when massive data flows continuously and a large number of attributes are associated with it, is a daunting task that requires considerable technical insight. This paper proposes a semi-supervised technique for spam detection in Twitter that employs an ensemble-based framework comprising four classifiers. The framework uses Probabilistic Data Structures (PDS) as classifiers at various stages: a Quotient Filter (QF) to query the URL, spam-user, and spam-word databases, and Locality Sensitive Hashing (LSH) for similarity search. These structures provide fast results with less computational effort. The performance of the framework has been evaluated through a comparative analysis of the PDS against similar data structures and through standard evaluation parameters, namely precision, recall, and F-score.
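
To make the four-stage idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes three membership lookups (URLs, users, words) and one similarity stage. Plain Python sets stand in for the Quotient Filter, and a simple MinHash comparison stands in for the LSH stage. The function names, the tweet dictionary layout, the similarity threshold, and the vote-counting rule are all hypothetical; the abstract does not specify how the classifiers are combined.

```python
import hashlib


def minhash_signature(tokens, num_hashes=64):
    """MinHash signature: for each seeded hash, keep the minimum over all tokens."""
    unique = set(tokens)
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            (int.from_bytes(
                hashlib.blake2b(f"{seed}:{tok}".encode(), digest_size=8).digest(),
                "big")
             for tok in unique),
            default=0,  # empty text: fall back to a constant signature
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions approximates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def is_spam(tweet, spam_urls, spam_users, spam_words, spam_signatures,
            sim_threshold=0.8, min_votes=2):
    """Combine four stage decisions by counting votes (an assumed rule)."""
    tokens = tweet["text"].lower().split()
    signature = minhash_signature(tokens)
    votes = [
        # Stages 1-3: membership queries (sets here; a Quotient Filter in the paper).
        any(url in spam_urls for url in tweet["urls"]),
        tweet["user"] in spam_users,
        any(tok in spam_words for tok in tokens),
        # Stage 4: similarity to known spam (MinHash here; LSH in the paper).
        any(estimated_jaccard(signature, s) >= sim_threshold
            for s in spam_signatures),
    ]
    return sum(votes) >= min_votes


known_spam = [minhash_signature("win a free prize now".split())]
suspicious = {"user": "bot42", "urls": ["http://bad.example"],
              "text": "Win a FREE prize now"}
benign = {"user": "alice", "urls": [], "text": "lunch with colleagues today"}

flag_suspicious = is_spam(suspicious, {"http://bad.example"}, {"bot42"},
                          {"free", "prize"}, known_spam)
flag_benign = is_spam(benign, {"http://bad.example"}, {"bot42"},
                      {"free", "prize"}, known_spam)
```

Unlike the exact sets used here, a real Quotient Filter can return false positives, so a production version would need to account for that error rate when tuning the vote threshold.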

Full Text