Arabic Text Classification using Feature-Reduction Techniques for Detecting Violence on Social Media

Hissah Alsaif, Taghreed Alotaibi

doi:10.14569/ijacsa.2019.0100409

Open Access

Abstract

With the growing number of online users, the amount of data shared online has grown as well. Techniques for discovering knowledge from these data can provide valuable information for detecting a range of problems, including violence. Violence is one of the significant problems humanity has faced worldwide in recent years, and it is a particular problem in Arabic countries. To help address it, this research focuses on detecting violence-related tweets. Text mining is an important technique for finding and predicting information from text. In this study, a text classification model is built for detecting violence in Arabic dialects on Twitter using different feature-reduction approaches. The experiment comprises bagging, K-nearest neighbors (KNN), and Bayesian boosting with different feature-extraction methods, namely root-based stemming, light stemming, and n-grams. In addition, the study used the following feature-reduction techniques: support vector machine (SVM), Chi-squared (CHI), the Gini index, correlation, rules, information gain (IG), deviation, symmetrical uncertainty, and the IG ratio. The experiment showed that bagging with the tri-gram approach has the highest accuracy at 86.61%, and that combining IG with SVM as feature-reduction techniques yields an accuracy of 90.59%.
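To make two of the ingredients named in the abstract concrete, the following is a minimal sketch of word tri-gram extraction followed by feature ranking with information gain (IG). It is an illustration only, not the authors' implementation: the function names, the whitespace tokenizer, the binary present/absent treatment of each feature, and the toy English sentences standing in for Arabic tweets are all assumptions introduced here.

```python
import math
from collections import Counter


def word_ngrams(text, n=3):
    """Return the word n-grams of a whitespace-tokenized text."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, feature):
    """IG of a binary feature: class entropy minus the weighted class
    entropy after splitting documents on presence of the feature."""
    present = [y for d, y in zip(docs, labels) if feature in d]
    absent = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def top_features(docs, labels, k):
    """Keep the k features with the highest IG (ties broken alphabetically)."""
    vocabulary = set().union(*docs)
    ranked = sorted(
        vocabulary,
        key=lambda f: (-information_gain(docs, labels, f), f),
    )
    return ranked[:k]


# Hypothetical toy corpus; real input would be preprocessed Arabic tweets.
texts = [
    "he hit the child hard",
    "he hit the child again",
    "the weather is nice today",
    "the weather is nice now",
]
labels = ["violence", "violence", "other", "other"]

# Each document becomes the set of tri-grams it contains.
docs = [set(word_ngrams(t, n=3)) for t in texts]
selected = top_features(docs, labels, k=4)
```

In this sketch the reduced feature set `selected` would then be passed to a classifier; tri-grams that occur in only one class score highest, while tri-grams that occur in a single document score lower because they separate the classes less cleanly.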

Highlights

One of the most important elements in living a normal and stable life is living in peace
Due to the evolution of technologies and the increasing number of internet users around the world, especially when it comes to using social networks, social media are becoming a significant environment for studying phenomena related to violence; this is because social network users publish events rapidly, and some of them use social media sites to voice complaints and ask for help
The results showed an improvement in the classification process, where the accuracy of the classification increased from 67.25% to 82.50%. Abuhaiba et al. [31] performed a study comparing the performance of a single classification algorithm with that of a combination of different algorithms for categorizing Arabic text

Summary

INTRODUCTION

One of the most important elements in living a normal and stable life is living in peace. Due to the evolution of technologies and the increasing number of internet users around the world, especially on social networks, social media are becoming a significant environment for studying phenomena related to violence; this is because social network users publish events rapidly, and some of them use social media sites to voice complaints and ask for help. From this perspective, the present research focuses on studying violence in the Kingdom of Saudi Arabia (KSA), given the increased number of violence cases: the Ministry of Labor and Social Development received more than 11,000 reports in a year [1]. This study investigated different feature-reduction methods and their effects on accuracy. It used machine learning techniques; machine learning algorithms can be categorized into two approaches: supervised and unsupervised.

ARABIC

RELATED WORK

MACHINE LEARNING

Bagging

Bayesian Boosting

Data Collection

Data Preprocessing

Feature Extraction

Feature Reduction

PERFORMANCE MEASUREMENT

EXPERIMENTAL RESULTS

Dataset Analysis

Classifier Accuracy

Baseline and Ensemble Methods Comparison

Feature-Reduction Performance

CONCLUSION

Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2019
Citations: 4
License type: cc-by
