The MGB-5 Challenge: Recognition and Dialect Identification of Dialectal Arabic Speech

Ahmed Ali, Khalid Choukri, Suwon Shon, Younes Samih, James Glass, Hamdy Mubarak, Steve Renals

doi:10.1109/asru46091.2019.9003960

Open Access

Publication Date: Dec 1, 2019
Citations: 48
License type: other-oa

Affiliation: Qatar Airways (Qatar), University of Edinburgh

Abstract

This paper describes the fifth edition of the Multi-Genre Broadcast Challenge (MGB-5), an evaluation focused on Arabic speech recognition and dialect identification. MGB-5 extends the previous MGB-3 challenge in two ways: first, it focuses on Moroccan Arabic speech recognition; second, the granularity of the Arabic dialect identification task is increased from 5 dialect classes to 17 by collecting data from 17 Arabic-speaking countries. Both tasks use YouTube recordings to provide a multi-genre, multi-dialectal challenge in the wild. The Moroccan speech transcription task used about 13 hours of transcribed speech data, split across training, development, and test sets, covering seven genres: comedy, cooking, family/kids, fashion, drama, sports, and science (TEDx). The fine-grained Arabic dialect identification data was collected from known YouTube channels from 17 Arabic countries. Of this data, 3,000 hours were released for training and 57 hours for development and testing. The dialect identification data was divided into three sub-categories based on segment duration: short (under 5 s), medium (5–20 s), and long (over 20 s). Overall, 25 teams registered for the challenge, and 9 teams submitted systems for the two tasks. We outline the approaches adopted in each system and summarize the evaluation results.
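
The duration-based split of the dialect identification data can be illustrated with a minimal Python sketch. The function and variable names are hypothetical and are not taken from the challenge tooling, and the handling of segments lasting exactly 5 s or 20 s (both treated as medium here) is an assumption read from the ranges stated in the abstract.

```python
from collections import defaultdict


def duration_category(seconds: float) -> str:
    """Map a segment duration in seconds to short, medium, or long.

    Assumed boundaries: short is under 5 s, medium is 5 to 20 s inclusive,
    long is anything over 20 s.
    """
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    if seconds < 5:
        return "short"
    if seconds <= 20:
        return "medium"
    return "long"


def group_by_duration(durations):
    """Group segment durations into the three sub-categories."""
    groups = defaultdict(list)
    for seconds in durations:
        groups[duration_category(seconds)].append(seconds)
    return dict(groups)


# Illustrative durations only; these are not values from the MGB-5 data.
example_groups = group_by_duration([2.3, 5.0, 12.7, 20.0, 31.4])
```

Reporting results separately for each sub-category makes it possible to see how identification accuracy varies with the amount of speech available per segment.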
