Exploring the Performance of Farasa and CAMeL Taggers for Arabic Dialect Tweets

Areej Alshutayri, Shahd Alahda, Aseel Alfaidi, Hajer Alwadei

doi:10.34028/iajit/20/3/7

Abstract

In Natural Language Processing (NLP), Part-of-Speech (POS) tagging is a fundamental step required by many applications, such as information extraction, machine translation, and grammar checking. Successful POS taggers have been developed for many languages, including Arabic. The spread of social media has increased the diversity of dialects in written text, because people use their own dialects in online communication. As a result, it has become harder for researchers to classify words that humans understand but computers do not. In addition, most Arabic POS research focuses on Modern Standard Arabic (MSA), while Dialectal Arabic (DA) receives less attention. This paper evaluates the performance of two Arabic taggers on dialectal Arabic tweets and determines which tagger is more appropriate, which in turn can help improve existing taggers for dialectal Arabic tweets. We used the Farasa and CAMeL taggers, which are commonly used to analyze Arabic texts and are considered the best taggers for Arabic. The results indicate that the CAMeL tagger performed better than the Farasa tagger, with accuracies of 92% and 83%, respectively. In other words, a hybrid POS tagger trained on both MSA and DA returns better results than one trained on MSA alone.
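The comparison described in the abstract rests on tagging accuracy. As a minimal sketch, assuming accuracy here means the share of tokens whose predicted tag matches a manually assigned gold tag (the abstract does not spell out the exact protocol), it could be computed as follows. The function name `tagging_accuracy`, the tag labels, and the toy data are all hypothetical and are not taken from the paper or from either tagger's API.

```python
from typing import Sequence


def tagging_accuracy(
    gold: Sequence[Sequence[str]],
    predicted: Sequence[Sequence[str]],
) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag.

    Both arguments are lists of sentences (e.g. tweets), each sentence being
    a list of POS tags aligned token by token.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_tags, predicted_tags in zip(gold, predicted):
        if len(gold_tags) != len(predicted_tags):
            raise ValueError("each sentence must have the same number of tags in both inputs")
        correct += sum(g == p for g, p in zip(gold_tags, predicted_tags))
        total += len(gold_tags)

    if total == 0:
        raise ValueError("cannot compute accuracy over zero tokens")
    return correct / total


# Hypothetical toy data: two short "tweets" with gold tags and the output
# of two imaginary taggers. These values are illustrative only.
gold = [["NOUN", "VERB", "PART"], ["ADJ", "NOUN"]]
tagger_a = [["NOUN", "VERB", "NOUN"], ["ADJ", "NOUN"]]  # 4 of 5 tags match
tagger_b = [["NOUN", "NOUN", "NOUN"], ["ADJ", "VERB"]]  # 2 of 5 tags match

accuracy_a = tagging_accuracy(gold, tagger_a)
accuracy_b = tagging_accuracy(gold, tagger_b)
```

Note that a fair comparison between two taggers in this way also requires mapping their tag sets onto a common scheme and aligning their tokenization, which this sketch assumes has already been done.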

Journal: The International Arab Journal of Information Technology
Publication Date: Jan 1, 2023
Citations: 1

Similar Papers

Part of Speech Tagging for Setswana African Language
M.A Dibitso ... P A Owolawi
01 Nov 2019

Exploring the Performance of Tagging for the Classical and the Modern Standard Arabic
Dia Abuzeina ... Taqieddin Mostafa Abdalbaset
Advances in Fuzzy Systems | VOL. 2019
23 Jan 2019

Combination of Genetic Algorithm and Brill Tagger Algorithm for Part of Speech Tagging Bahasa Madura
Nindian Puspa Dewi ... Ubaidi Ubaidi
Proceeding of the Electrical Engineering Computer Science and Informatics | VOL. 7
01 Oct 2020

Hidden Markov Model based Part of Speech Tagging for Nepali language
Abhijit Paul ... Bipul Syam Purkayastha
01 Sep 2015
