Abstract

Social networking sites such as Twitter, Facebook, and Weibo are now mainstream. Many malicious users exploit these sites to manipulate legitimate users for various purposes, for example to promote products, spread spam links, or defame other people. As the number of people using these sites keeps growing, fake accounts have become a major problem. In this paper, fake accounts are detected using a blacklist instead of a traditional spam words list. The blacklist is created using a topic modeling approach and a keyword extraction approach. We evaluate our blacklist-based approach on the 1KS-10KN dataset and the Social Honeypot dataset and compare its accuracy with that of the traditional approach based on a spam words list. Diverse Ensemble Creation by Oppositional Relabeling of Artificial Training Examples (DECORATE), a meta-learner classifier, is applied to distinguish fake accounts from legitimate accounts on Twitter. Our approach achieves 95.4% accuracy with a true positive rate of 0.95.


Summary

INTRODUCTION

People use Twitter to share their feelings, news, and events and to post about daily activities such as eating, drinking, and travelling. Malicious users can monitor anyone's activities from their timeline, so Twitter becomes a place where malicious users can commit crimes. These malicious users create fake accounts and spread fake news, links, and photos. Detecting fake accounts on Twitter therefore matters to everyone who uses social networking sites. Previous researchers have detected fake accounts on Twitter using various features: some relied on the content of tweets alone, while others used both content and profile features. The aim of this approach is to detect fake accounts on Twitter based on the content of tweets. Its major contribution is a blacklist that can effectively extract fake features from fake accounts.

RELATED WORK
BLACKLIST CREATION
Preprocessing
Topic Extraction
Keyword Extraction
FAKE ACCOUNTS DETECTION ON TWITTER
Features Extraction
Number of tweets
Number of URLs
Number of hashtags
Number of mentions
Number of fake words
Fake words ratio
URLs ratio
Hashtags ratio
Mention ratio
Total number of words
Classification
EXPERIMENTAL RESULTS
CONCLUSION
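
The content features named in the outline above (counts of tweets, URLs, hashtags, mentions, and fake words, plus their ratios and the total word count) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_features`, the regular expressions, and the ratio denominators are assumptions made for illustration. Here the fake words ratio is taken over the total number of words, and the URL, hashtag, and mention ratios are taken per tweet. The blacklist is assumed to be a plain set of lowercase words.

```python
import re

# Simple patterns for tweet entities (assumed, not taken from the paper).
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")
WORD_RE = re.compile(r"[a-z']+")


def _ratio(count, total):
    """Return count / total, or 0.0 when total is zero."""
    return count / total if total else 0.0


def extract_features(tweets, blacklist):
    """Compute per-account content features from a list of tweet strings.

    `blacklist` is a set of lowercase fake words (hypothetical input format).
    """
    n_urls = n_hashtags = n_mentions = n_fake = n_words = 0
    for tweet in tweets:
        n_urls += len(URL_RE.findall(tweet))
        n_hashtags += len(HASHTAG_RE.findall(tweet))
        n_mentions += len(MENTION_RE.findall(tweet))
        # Remove URLs, hashtags, and mentions before counting plain words.
        text = URL_RE.sub(" ", tweet)
        text = HASHTAG_RE.sub(" ", text)
        text = MENTION_RE.sub(" ", text)
        words = WORD_RE.findall(text.lower())
        n_words += len(words)
        n_fake += sum(1 for w in words if w in blacklist)

    n_tweets = len(tweets)
    return {
        "num_tweets": n_tweets,
        "num_urls": n_urls,
        "num_hashtags": n_hashtags,
        "num_mentions": n_mentions,
        "num_fake_words": n_fake,
        "total_words": n_words,
        # Assumed denominators: words for fake words, tweets for the rest.
        "fake_words_ratio": _ratio(n_fake, n_words),
        "urls_ratio": _ratio(n_urls, n_tweets),
        "hashtags_ratio": _ratio(n_hashtags, n_tweets),
        "mentions_ratio": _ratio(n_mentions, n_tweets),
    }


if __name__ == "__main__":
    sample = [
        "Win free money now http://spam.example/x #free @bob",
        "Lunch with friends",
    ]
    print(extract_features(sample, {"free", "money"}))
```

A feature dictionary of this kind, computed per account, would then be passed to the classifier; the paper's own feature definitions may differ from the assumptions made here.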