Abstract

Social networks have become an integral part of daily life, and their rapid growth has increased how much we communicate through them. Twitter is one of the most popular networks in the Middle East. Like other social media platforms, Twitter is vulnerable to spam accounts that spread malicious content. Arab countries are among the most targeted, possibly because of a lack of effective technologies that support the Arabic language. In addition, Arabic is a complex language with extensive grammar rules and many dialects, which makes extracting text data challenging. Innovative methods to combat spam on Twitter have been the subject of many recent studies. This paper addresses the detection of Arabic spam accounts on Twitter by collecting an Arabic dataset suitable for spam detection. The dataset includes premium features obtained through the Twitter premium API. Data were labeled by flagging suspended accounts. A combined framework based on deep-learning methods is proposed; its advantages include more accurate and faster results with lower computational demands. Two types of data were used: text-based data with a convolutional neural network (CNN) model and metadata with a simple neural network model. The outputs of the two models were combined to classify accounts as spam or not spam. The proposed framework achieved an accuracy of 94.27% with the combined model using premium feature data, outperforming the best models tested thus far in the literature.
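To make the two-branch idea in the abstract concrete, the following is a minimal sketch of a forward pass only, not the authors' implementation. It assumes a text branch (embedding, 1D convolution, global max pooling) and a metadata branch (one dense layer) whose features are concatenated and passed to a single sigmoid unit. The abstract does not say how the two outputs are combined, so the concatenation, the layer sizes, the random untrained weights, and every name (`CombinedSpamModel`, `predict_proba`, and so on) are illustrative assumptions.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def conv1d_relu(seq, kernels, bias):
    """Valid 1D convolution over a list of embedding vectors, followed by ReLU."""
    k = len(kernels[0])  # kernel size (number of consecutive tokens per window)
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        row = []
        for f, kern in enumerate(kernels):
            s = bias[f] + sum(
                w * x
                for kern_vec, tok_vec in zip(kern, window)
                for w, x in zip(kern_vec, tok_vec)
            )
            row.append(relu(s))
        out.append(row)
    return out


def global_max_pool(feature_maps):
    """Keep the maximum activation of each filter across all positions."""
    return [max(col) for col in zip(*feature_maps)]


def dense(x, weights, bias, activation):
    """Fully connected layer: one output per weight row."""
    return [
        activation(b + sum(w * xi for w, xi in zip(row, x)))
        for row, b in zip(weights, bias)
    ]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class CombinedSpamModel:
    """Hypothetical two-branch classifier with random (untrained) weights."""

    def __init__(self, vocab_size=50, embed_dim=4, n_filters=3,
                 kernel_size=2, n_meta=3, meta_hidden=4, seed=0):
        rng = random.Random(seed)
        self.kernel_size = kernel_size
        self.embedding = rand_matrix(rng, vocab_size, embed_dim)
        self.kernels = [rand_matrix(rng, kernel_size, embed_dim)
                        for _ in range(n_filters)]
        self.conv_bias = [0.0] * n_filters
        self.meta_w = rand_matrix(rng, meta_hidden, n_meta)
        self.meta_b = [0.0] * meta_hidden
        self.out_w = rand_matrix(rng, 1, n_filters + meta_hidden)
        self.out_b = [0.0]

    def predict_proba(self, token_ids, metadata):
        """Return a score in (0, 1); higher means more spam-like."""
        if len(token_ids) < self.kernel_size:
            raise ValueError("need at least kernel_size tokens")
        # Text branch: embedding lookup -> convolution -> global max pooling.
        embedded = [self.embedding[i] for i in token_ids]
        text_features = global_max_pool(
            conv1d_relu(embedded, self.kernels, self.conv_bias))
        # Metadata branch: a single dense layer over scaled account features.
        meta_features = dense(metadata, self.meta_w, self.meta_b, relu)
        # Assumed combination step: concatenate both branches, then one sigmoid unit.
        combined = text_features + meta_features
        return dense(combined, self.out_w, self.out_b, sigmoid)[0]


# Made-up inputs: four token ids and three metadata values scaled to [0, 1].
model = CombinedSpamModel()
p = model.predict_proba([3, 17, 42, 8], [0.2, 0.9, 0.1])
label = "spam" if p >= 0.5 else "not spam"
```

Because the weights are random, the score carries no meaning until the model is trained. In practice a deep-learning library would replace these hand-written layers, and the sketch only shows how text and metadata features might meet in one classifier.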

Highlights

  • Online social networks (OSNs) are integral to our lives

  • Twitter premium is a subscription service with a monthly fee, divided into categories according to the needs of users, researchers, and commercial companies, so that each can benefit from the data

  • The long short-term memory (LSTM)-combined model achieved 94% and 93.8%, while DT followed with 91.11% precision and 91.05% recall, surpassing logistic regression (LR) and support vector machine (SVM)
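The highlights report precision and recall, which for a binary spam classifier are defined as TP / (TP + FP) and TP / (TP + FN). A minimal sketch of that computation follows. The function name `precision_recall` and the toy labels are illustrative assumptions, with 1 standing for spam and 0 for not spam. The labels are not data from the paper.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # prints 0.75 0.75
```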


Summary

Introduction

Online social networks (OSNs) are integral to our lives. As one of the most significant OSNs [3], Twitter attracts users by offering a free “microblogging” service for posting short messages called “tweets” [4,5]. Users connect with other users by “following” them, and each account has public following and follower counts, which creates the “social” aspect of the network [7]. Searching for information of interest or the latest news via Twitter is simple: users can enter hashtags or keywords into the search function, or click on a hashtag in their feed or in the “trending” section, to review all the posts containing that keyword or hashtag [2].
