Abstract

Twitter has become a target platform for both promoters and spammers, whose messages there are more harmful than those spread by traditional methods such as email spam. Recently, large numbers of campaigns containing many spam or promotion accounts have emerged on Twitter. These campaigns post unwanted information cooperatively, and can therefore affect more normal users than individual spam or promotion accounts can. Organizing or participating in campaigns has become the main technique for spreading spam or promotion information on Twitter. Because traditional solutions focus on checking individual accounts or messages, efficient techniques for detecting spam and promotion campaigns on Twitter are urgently needed. In this article, we propose a framework to detect both spam and promotion campaigns. Our framework consists of three steps: the first step links accounts that post URLs for similar purposes; the second step extracts candidate campaigns that may serve spam or promotion purposes; and the third step classifies the candidate campaigns into normal, spam, and promotion groups. The key point of the framework is how to measure the similarity between accounts' purposes in posting URLs. We present two similarity measures based on Shannon information theory: the first uses the URLs posted by the users, and the second considers both URLs and timestamps. Experimental results demonstrate that the proposed methods extract the majority of the candidate campaigns correctly, and detect promotion and spam campaigns with high precision and recall.
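The abstract does not give the similarity formula, so the following is only a minimal sketch of how the first two steps might look under stated assumptions. It assumes each URL is weighted by its Shannon self-information, -log2 of the fraction of accounts that posted it, and that two accounts are compared by the ratio of shared to combined information. The function names, the threshold of 0.5, and the use of connected components to form candidate campaigns are all hypothetical choices for illustration, not the authors' method. Timestamps and the third (classification) step are not covered.

```python
import math
from collections import defaultdict


def url_information(posts):
    """Map each URL to its self-information in bits.

    posts: dict mapping account name -> set of URLs it posted.
    Assumption: p(url) is the fraction of accounts that posted the URL,
    so rarely shared URLs carry more information than popular ones.
    """
    n = len(posts)
    counts = defaultdict(int)
    for urls in posts.values():
        for url in urls:
            counts[url] += 1
    return {url: -math.log2(c / n) for url, c in counts.items()}


def similarity(a, b, posts, info):
    """Ratio of information in shared URLs to information in all URLs of a and b."""
    shared = posts[a] & posts[b]
    combined = posts[a] | posts[b]
    total = sum(info[u] for u in combined)
    if total == 0:
        return 0.0  # only URLs posted by every account: no evidence either way
    return sum(info[u] for u in shared) / total


def link_accounts(posts, threshold=0.5):
    """Step 1 (sketch): link account pairs whose similarity reaches the threshold."""
    info = url_information(posts)
    accounts = sorted(posts)
    edges = []
    for i, a in enumerate(accounts):
        for b in accounts[i + 1:]:
            if similarity(a, b, posts, info) >= threshold:
                edges.append((a, b))
    return edges


def candidate_campaigns(posts, threshold=0.5, min_size=2):
    """Step 2 (sketch): connected components of the link graph as candidates."""
    parent = {a: a for a in posts}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in link_accounts(posts, threshold):
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for a in posts:
        groups[find(a)].add(a)
    return [g for g in groups.values() if len(g) >= min_size]
```

Under this weighting, two accounts that share a rarely posted URL score higher than two accounts that share only a widely posted one, which is the intuition an information-theoretic measure is meant to capture. The pairwise loop is quadratic in the number of accounts and would need indexing by URL to scale.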
