Abstract

Social media sites such as Twitter provide effective platforms for sharing opinions and thoughts publicly with millions of other users. Opinions shared on these sites influence large numbers of people, who can easily retweet them and accelerate their spread. Unfortunately, some of these opinions are expressed by extremists who promote hateful content. Since Arabic is one of the most widely spoken languages, it is crucial to automate the monitoring of Arabic content published on social sites. This study therefore proposes a hybrid technique for detecting extremism in Arabic social media texts and articles, so that published extremist content can be monitored. The proposed technique combines the lexicon-based approach with the rough set theory approach. Rough set theory is employed with two approximation strategies: lower approximation and accuracy approximation. In the hybrid technique, rough set theory serves as the classifier and the lexicon-based approach supplies the vector. Furthermore, this study built three corpora (V1, V2, and V3) collected from Twitter. The experimental findings show that, among the proposed hybrid methods, the accuracy approximation was superior to the lower approximation with the seed vector. The hybrid methods also outperformed machine learning techniques in terms of efficiency. The study therefore recommends using the accuracy approximation method with the seed vector to identify the polarity of a text.
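To make the two rough set notions named above concrete, the following is a minimal sketch using the textbook definitions of lower approximation, upper approximation, and accuracy of approximation (the ratio of their sizes). It is not the study's implementation: the abstract does not say how the approximations are turned into a classifier, and reading "accuracy approximation" as this ratio is an assumption. The seed terms, the toy English texts, the labels, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def seed_vector(text, seed_terms):
    """Binary vector: 1 where a seed lexicon term occurs as a token in the text."""
    tokens = set(text.split())
    return tuple(int(term in tokens) for term in seed_terms)

def approximations(vectors, labels, target):
    """Return (lower, upper) sets of text indices for the target label."""
    # Texts with identical vectors are indiscernible and form one class.
    classes = defaultdict(set)
    for i, vec in enumerate(vectors):
        classes[vec].add(i)
    target_set = {i for i, lab in enumerate(labels) if lab == target}
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target_set:   # class lies entirely inside the target
            lower |= members
        if members & target_set:    # class overlaps the target
            upper |= members
    return lower, upper

def accuracy_of_approximation(lower, upper):
    """Ratio |lower| / |upper|; 0.0 when the upper approximation is empty."""
    return len(lower) / len(upper) if upper else 0.0

# Hypothetical seed lexicon and toy labeled texts (assumptions, not study data).
seed_terms = ("hate", "attack")
texts = [
    "hate attack now",
    "attack hate again",
    "hate this weather",
    "hate the traffic",
    "nice day",
]
labels = ["extremist", "extremist", "extremist", "neutral", "neutral"]

vectors = [seed_vector(t, seed_terms) for t in texts]
lower, upper = approximations(vectors, labels, "extremist")
accuracy = accuracy_of_approximation(lower, upper)
print(sorted(lower), sorted(upper), accuracy)
```

In this toy data, texts 2 and 3 share the same seed vector but carry different labels, so they fall in the upper approximation of the extremist class but not the lower one. That boundary region is what separates a strict lower-approximation decision from one based on the accuracy ratio.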
