An Enhanced Twitter Corpus for the Classification of Arabic Speech Acts

Majdi Ahed, Bassam H, Mohammad A

doi:10.14569/ijacsa.2020.0110325

Abstract

Twitter has gained wide attention as a major social media platform where many topics are discussed on a daily basis through millions of tweets. A tweet can be viewed as a speech act (SA), which is an utterance that presents information, carries an indirect (hidden) meaning, or performs an action. According to SA theory, an SA can represent an assertion, a question, a recommendation, or many other things. In this paper, we tackle the problem of constructing a reference corpus of Arabic tweets for the classification of Arabic speech acts. We refer to this corpus as the Arabic Tweets Speech Act Corpus (ArTSAC). It is an enhancement of ArSAS, a modern standard Arabic (MSA) tweet corpus of speech acts, and offers richer annotated features than ArSAS. The goal of ArTSAC is twofold: first, to understand the purpose and intention of tweets in accordance with SA theory, and hence to positively influence the development of many natural language processing (NLP) applications; second, as a future goal, to serve as a benchmark annotated dataset for testing and evaluating state-of-the-art Arabic SA classification algorithms and applications. ArTSAC has been put into practice to classify Arabic tweets containing speech acts using the Support Vector Machine (SVM) classification algorithm. The results of the experiments show that the enhanced ArTSAC corpus achieved an average precision of 90.6% and an F-score of 89.6%, substantially outperforming the results of its predecessor, the ArSAS corpus.
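To make the precision and F-score figures in the abstract more concrete, the following is a minimal sketch of how per-class precision, recall, and F-score can be computed and then averaged for a multi-class speech act classifier. It assumes macro-averaging (an unweighted mean over classes); the abstract does not say which averaging scheme was used. The function names, the three labels, and the toy predictions are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter


def per_class_scores(gold, predicted):
    """Return {label: (precision, recall, f1)} for every label in gold or predicted."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed
    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        pred_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / pred_total if pred_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall, and F1 over all classes (an assumption)."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical toy labels; real annotations would come from the corpus itself.
gold = ["assertion", "assertion", "question", "question",
        "recommendation", "recommendation"]
predicted = ["assertion", "question", "question", "question",
             "recommendation", "assertion"]

scores = per_class_scores(gold, predicted)
macro_p, macro_r, macro_f1 = macro_average(scores)
print(f"macro precision={macro_p:.3f} recall={macro_r:.3f} F1={macro_f1:.3f}")
```

In practice the `predicted` list would be the output of a trained classifier such as an SVM on a held-out test split. A weighted average, which accounts for class frequency, would give different numbers when the classes are imbalanced.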

Highlights

People discuss a wide range of issues and topics on Twitter through their tweets
Before discussing the results obtained with our modified Arabic Tweets Speech Act Corpus (ArTSAC), we review the previous results obtained with the Arabic Speech Act and Sentiment corpus (ArSAS) [33]; we then run a Support Vector Machine (SVM) on the modified ArTSAC corpus and compare its results with those of ArSAS
We presented the development and construction of a richly annotated reference corpus of Arabic tweets for speech act classification

Summary

INTRODUCTION

People discuss a wide range of issues and topics on Twitter through their tweets. Twitter has recently attracted considerable attention from the popular press and, increasingly, from scholars. Because of the tremendous volume of tweets, classifying them and extracting useful information from them is a big data management problem. NLP tasks such as sentiment analysis [10], rumor detection [11], and the evaluation of customer satisfaction are important in many online applications today, especially in big data environments where the need for automated tools is urgent. We tackle the problem of creating a reference corpus of Arabic tweets for the classification of Arabic speech acts. ArTSAC has two goals; the first is to understand the purpose and intention of people’s tweets in accordance with SA theory, and thereby to positively influence the development of many Arabic NLP applications.

RELATED WORKS

PROPOSED ARABIC TWEETS ACT CLASSIFIER

Construction of the ArTSAC Reference Corpus

Support Vector Machine

Evaluation of ArTSAC

Findings

CONCLUSIONS

Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2020
Citations: 1
License type: CC BY
