Abstract
As the number of online users grows, so does the amount of data shared online. Techniques for discovering knowledge from these data can provide valuable information for detecting a range of problems, including violence. Violence has been a significant problem worldwide in recent years, and especially in Arab countries. To help address it, this research focuses on detecting violence-related tweets. Text mining is an important technique for finding and predicting information from text. In this study, a text classification model is built for detecting violence in Arabic dialects on Twitter using different feature-reduction approaches. The experiments compare bagging, K-nearest neighbors (KNN), and Bayesian boosting classifiers with different feature-extraction methods, namely root-based stemming, light stemming, and n-grams. In addition, the study used the following feature-reduction techniques: support vector machine (SVM), Chi-squared (CHI), the Gini index, correlation, rules, information gain (IG), deviation, symmetrical uncertainty, and the IG ratio. The experiments showed that bagging with tri-grams has the highest accuracy among the classifiers at 86.61%, and that among the feature-reduction techniques, a combination of IG with SVM reaches an accuracy of 90.59%.
Highlights
One of the most important elements in living a normal and stable life is living in peace
Due to the evolution of technologies and the increasing number of internet users around the world, especially when it comes to using social networks, social media are becoming a significant environment for studying phenomena related to violence; this is because social network users publish events rapidly, and some of them use social media sites to voice complaints and ask for help
The results showed an improvement in the classification process, where the accuracy of the classification increased from 67.25% to 82.50%. Abuhaiba et al. [31] performed a study comparing the performance of a single classification algorithm with that of a combination of different algorithms for categorizing Arabic text
Summary
One of the most important elements in living a normal and stable life is living in peace. Due to the evolution of technologies and the increasing number of internet users around the world, especially when it comes to using social networks, social media are becoming a significant environment for studying phenomena related to violence; this is because social network users publish events rapidly, and some of them use social media sites to voice complaints and ask for help. From this perspective, the present research focuses on violence in the Kingdom of Saudi Arabia (KSA), where the number of violence cases has increased and the Ministry of Labor and Social Development received more than 11,000 reports in a year [1]. This study investigated different feature-reduction methods and their effects on accuracy. It used machine learning techniques; machine learning algorithms can be categorized into two approaches: supervised and unsupervised.
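To make the idea of feature reduction concrete, the following is a minimal sketch of information gain (IG) scoring over word n-grams, one of the reduction techniques named in the abstract. It is not the authors' implementation: the function names (`entropy`, `word_ngrams`, `information_gain`, `top_k_features`), the whitespace tokenizer, and the tiny English toy corpus are all assumptions made for illustration, and the study itself worked with Arabic tweets and additional preprocessing such as stemming.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def word_ngrams(text, n):
    """Word-level n-grams from whitespace-separated tokens (assumed tokenizer)."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def information_gain(docs, labels, n=1):
    """Score each n-gram by IG = H(class) - H(class | n-gram present or absent)."""
    base = entropy(labels)
    doc_features = [set(word_ngrams(d, n)) for d in docs]
    vocabulary = set().union(*doc_features)
    scores = {}
    for feat in vocabulary:
        present = [y for f, y in zip(doc_features, labels) if feat in f]
        absent = [y for f, y in zip(doc_features, labels) if feat not in f]
        conditional = (
            len(present) * entropy(present) + len(absent) * entropy(absent)
        ) / len(labels)
        scores[feat] = base - conditional
    return scores


def top_k_features(scores, k):
    """Keep the k highest-scoring features; ties are broken alphabetically."""
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


# Hypothetical toy corpus, NOT data from the study.
toy_docs = [
    "attack threat street",
    "threat attack home",
    "weather nice street",
    "nice weather home",
]
toy_labels = ["violent", "violent", "neutral", "neutral"]

toy_scores = information_gain(toy_docs, toy_labels, n=1)
selected = top_k_features(toy_scores, k=4)
```

In this toy corpus, words that occur in only one class (such as "attack") receive the maximum score, while words spread evenly across both classes (such as "street") score zero and would be dropped. A supervised classifier would then be trained on the reduced feature set only.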