Abstract

The Arabic language has many dialects spoken across the twenty-two Arabic-speaking countries in Asia and Africa. Arabic Dialect Identification (ADI) remains a challenging task due to the well-recognized complexity and variation of Arabic dialects. Notably, Arabic dialects share the majority of their tokens. State-of-the-art solutions have been built upon various machine learning approaches; however, they commonly treat all words as equally likely and thus ignore the importance of dialectal words with respect to a given dialect. In this paper, we propose a three-stage neural approach to learn dialectal semantic representations from a given corpus. Specifically, we first capture the dialect-relevant information, which is then used to model the dialectal vector representation. The goal is to filter out the words shared between dialects and thereby reduce the noisy information fed to the fully connected layer. We introduce two variants, one LSTM-based and one Transformer-based. Finally, we empirically evaluate the proposed solution through a comparative study on real benchmark datasets, including MADAR, NADI, and QADI. Our extensive experiments show that it consistently achieves state-of-the-art performance. Given the well-recognized difficulty of ADI, the improvement margins can be deemed considerable. The code is available at https://github.com/amurtadha/arabic-dialect-identification.
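To make the described architecture concrete, the following is a minimal sketch (not the authors' implementation) of the LSTM-based variant: a learned gate downweights tokens shared across dialects before a BiLSTM encoder and a fully connected classifier. All module names, dimensions, and the number of dialect labels are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's released code): a dialect-relevance
# gate suppresses shared (non-dialectal) tokens before encoding and classification.
import torch
import torch.nn as nn


class DialectAwareLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=256, num_dialects=26):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Scalar gate per token: scores how dialect-relevant a word is, so that
        # words shared across dialects contribute less to the representation.
        self.relevance_gate = nn.Linear(embed_dim, 1)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_dialects)

    def forward(self, token_ids):
        emb = self.embedding(token_ids)                 # (B, T, E)
        gate = torch.sigmoid(self.relevance_gate(emb))  # (B, T, 1)
        filtered = emb * gate                           # downweight shared words
        encoded, _ = self.encoder(filtered)             # (B, T, 2H)
        pooled = encoded.mean(dim=1)                    # sentence representation
        return self.classifier(pooled)                  # dialect logits


# Example usage: score a batch of two token-id sequences of length 12.
model = DialectAwareLSTM(vocab_size=50_000)
logits = model(torch.randint(1, 50_000, (2, 12)))
print(logits.shape)  # torch.Size([2, 26])
```

The Transformer-based variant would replace the BiLSTM encoder with a Transformer encoder while keeping the same gating and classification stages.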
