Abstract

The rapidly growing online social networking sites have been infiltrated by a large amount of spam. In this paper, I focus on one of the most popular sites Twitter as an example to study the spam behaviors. To facilitate the spam detection, a directed social graph model is proposed to explore the “follower” and “friend” relationships among users. Based on Twitter’s spam policy, novel content-based features and graph-based features are also proposed. A Web crawler is developed relying on Twitter’s API methods. A spam detection prototype system is proposed to identify suspicious users on Twitter. I analyze the data set and evaluate the performance of the detection system. Classic evaluation metrics are used to compare the performance of various traditional classification methods. Experiment results show that the Bayesian classifier has the best overall performance in term of F-measure. The trained Bayesian classifier is also applied to the entire data set to distinguish the suspicious behaviors from normal ones. The result shows that the spam detection system can achieve 89% precision.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.