Mapping Linguistic Variations in Colloquial Arabic through Twitter

Abdulfattah Omar, Mohamed Elarabawy, Hamza Ethleb

doi:10.14569/ijacsa.2020.0111110

Abstract

Recent years have witnessed the development of computational approaches to the study of linguistic variation and regional dialectology in several languages, including English, German, Spanish, and Chinese. These approaches have proved effective in handling large corpora and in supporting reliable generalizations about the data. In Arabic, however, much of the work on regional dialectology is still based on traditional methods, which makes it difficult to provide a comprehensive mapping of dialectal variation across the colloquial dialects of Arabic. This study therefore proposes a computational statistical model for mapping linguistic variation and regional dialectology in Colloquial Arabic through Twitter, based on the lexical choices of speakers. The aim is to explore lexical patterns, as derived from Twitter users, for generating regional dialect maps. The study is based on a corpus of 1,597,348 geolocated Twitter posts. Using principal component analysis (PCA), the data were classified into distinct classes and the lexical features of each class were identified. Results indicate that the lexical choices of Twitter users can be used effectively to map regional dialect variation in Colloquial Arabic.

Highlights

Sociolinguists have studied lexical variation and have correlated the process through which speaker groups choose their vocabulary with a bundle of variables such as gender, context, social status, and topic [1,2,3,4].
Social media channels and networks provide good opportunities for researchers and sociolinguists to study and explore linguistic variation among different speaker groups.
The study of linguistic variation through social media networks has developed in parallel with computational methods.

Summary

Introduction

Sociolinguists have studied lexical variation and have correlated the process through which speaker groups choose their vocabulary with a bundle of variables such as gender, context, social status, and topic [1,2,3,4]. Social media channels and networks provide good opportunities for researchers and sociolinguists to study and explore linguistic variation among different speaker groups. The study of linguistic variation through social media networks has developed in parallel with computational methods. This study proposes a computational model for mapping linguistic variation and regional dialectology in Colloquial Arabic through Twitter, based on the lexical choices of speakers. To map the linguistic variation of Colloquial Arabic dialects, cluster analysis methods were used. Cluster analysis assigns the data to classes or groups, each with distinct features that differentiate it from the other groups.
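The general idea of grouping regions by their lexical choices can be illustrated with a small sketch. The paper's own pipeline is not described here, so everything below is an assumption made for illustration: the region names, the variant counts, the helper function names, and the choice to extract only the first principal component by power iteration and then split regions by the sign of their score. It shows one simple way PCA can separate regions whose speakers prefer different lexical variants; it is not the authors' implementation.

```python
import math

# Hypothetical counts of four lexical variants in four regions.
# These numbers are invented for illustration, not taken from the study.
COUNTS = {
    "region_1": [90, 10, 80, 20],
    "region_2": [85, 15, 75, 25],
    "region_3": [10, 90, 20, 80],
    "region_4": [15, 85, 25, 75],
}


def relative_frequencies(counts):
    """Convert raw variant counts into proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def center_columns(rows):
    """Subtract each column's mean so every variant has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of the centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    d = len(cov)
    # A non-uniform start vector: a uniform one would be mapped to zero,
    # because proportion rows sum to 1 and so centered rows sum to 0.
    v = [float(i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("no variance along the start vector")
        v = [x / norm for x in w]
    return v


def pc1_scores(counts_by_region):
    """Project each region onto the first principal component."""
    regions = list(counts_by_region)
    rows = center_columns(
        [relative_frequencies(counts_by_region[r]) for r in regions]
    )
    v = first_component(covariance(rows))
    return {r: sum(x * w for x, w in zip(row, v))
            for r, row in zip(regions, rows)}


def split_by_sign(scores):
    """Assign regions to two classes by the sign of their PC1 score."""
    return {r: ("A" if s >= 0 else "B") for r, s in scores.items()}


if __name__ == "__main__":
    scores = pc1_scores(COUNTS)
    for region, label in split_by_sign(scores).items():
        print(region, label, round(scores[region], 3))
```

The sign of a principal component is arbitrary, so the labels "A" and "B" carry no meaning in themselves; only the grouping matters. A study at the scale described here would more likely use an established numerical library and keep several components rather than one.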


Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2020
License type: cc-by

Similar Papers

Effects of semantic context and regional dialect variation on speech intelligibility in noise.
Cynthia G Clopper
The Journal of the Acoustical Society of America | VOL. 125
01 Apr 2009

Investigation on the Relationship between Biodiversity and Linguistic Diversity in China and Its Formation Mechanism.
Xuliang Zhang ... Hongrun Ju
International Journal of Environmental Research and Public Health | VOL. 19
03 May 2022

Aspects of the phonology and verb morphology of three Yemeni dialects
01 Jan 1989

Understanding U.S. regional linguistic variation with Twitter data analysis
Yuan Huang ... Jack Grieve
Computers, Environment and Urban Systems | VOL. 59
31 Dec 2015
