Detecting malicious tweets in trending topics using a statistical analysis of language

Juan Martinez-Romo, Lourdes Araujo

doi:10.1016/j.eswa.2012.12.015

Open Access

Abstract

Twitter spam detection is a recent area of research in which most previous work has focused on identifying malicious user accounts and on honeypot-based approaches. In this paper we present a methodology based on two new aspects: the detection of spam tweets in isolation, without prior information about the user, and the application of a statistical analysis of language to detect spam in trending topics. Trending topics capture the emerging Internet trends and topics of discussion that are on everybody’s lips. This growing microblogging phenomenon therefore allows spammers to disseminate malicious tweets quickly and massively. We present the first work that tries to detect spam tweets in real time using language as the primary tool. We first collected and labeled a large dataset with 34K trending topics and 20 million tweets. We then proposed a reduced set of features that are hard for spammers to manipulate. In addition, we developed a machine learning system with some orthogonal features that can be combined with other feature sets in order to analyze emergent characteristics of spam in social networks. We also conducted an extensive evaluation showing that our system obtains an F-measure at the same level as the best state-of-the-art systems based on the detection of spam accounts. Thus, our system can be applied to Twitter spam detection in trending topics in real time, mainly because it analyzes tweets instead of user accounts.
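
The abstract does not say which language statistic is used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a smoothed unigram language model and Kullback-Leibler divergence as the measure of how far a tweet's language drifts from that of its trending topic. The function names (`unigram_model`, `kl_divergence`, `f_measure`) and the toy tweets are hypothetical. The F-measure helper uses the standard harmonic mean of precision and recall.

```python
import math
from collections import Counter


def unigram_model(texts, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a fixed vocabulary.

    Smoothing is an assumption of this sketch; it keeps every probability
    positive so the divergence below is always defined.
    """
    counts = Counter(
        tok for text in texts for tok in text.lower().split() if tok in vocab
    )
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for models over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1), 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented purely for illustration.
topic_tweets = ["worldcup final tonight", "great goal in the worldcup final"]
on_topic_tweet = "what a goal in the final"
suspicious_tweet = "worldcup free iphone click here win prize"

# Shared vocabulary so that all three models are directly comparable.
vocab = {
    tok
    for text in topic_tweets + [on_topic_tweet, suspicious_tweet]
    for tok in text.lower().split()
}

topic_model = unigram_model(topic_tweets, vocab)
on_topic_score = kl_divergence(unigram_model([on_topic_tweet], vocab), topic_model)
suspicious_score = kl_divergence(unigram_model([suspicious_tweet], vocab), topic_model)
```

In this toy setup, a tweet that borrows a trending keyword but otherwise uses unrelated vocabulary diverges more from the topic model than a genuine on-topic tweet. A divergence like this would be one feature fed to a classifier, not a decision rule by itself.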

Journal: Expert Systems with Applications
Publication Date: Dec 20, 2012
Citations: 174
License type: cc-by-nc-nd

Similar Papers

Real-time trending topics detection and description from Twitter content
Amina Madani ... Omar Boussaid
Social Network Analysis and Mining | VOL. 5
05 Oct 2015

Bayesian probabilistic tensor factorization for malicious tweets in trending topics
Saini Jacob Soman ... S Murugappan
01 Jul 2014

A Performance Evaluation of Machine Learning-Based Streaming Spam Tweets Detection
Chao Chen ... Abdulhameed Alelaiwi
IEEE Transactions on Computational Social Systems | VOL. 2
01 Sep 2015

Statistical Analysis for Twitter Spam Detection
Ganesh Udge ... Mahesh Mohite
International Journal of Scientific Research in Science, Engineering and Technology | VOL. 6
15 Apr 2019
