A Multi-Genre Urdu Broadcast Speech Recognition System

Erbaz Khan, Farah Adeeba, Sahar Rauf, Sarmad Hussain

doi:10.1109/o-cocosda202152914.2021.9660552

Abstract

This paper reports the development of a multi-genre Urdu broadcast (BC) speech corpus and a Large Vocabulary Continuous Speech Recognition (LVCSR) system. A BC speech corpus of 98 hours from 453 speakers is collected and annotated. For acoustic modeling, a Time-Delay Neural Network (TDNN) is developed with prior Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) training and alignments. For language modeling, 3-gram, 4-gram, and Recurrent Neural Network (RNN) based models are developed on a text corpus of 188 million words. The developed models are tested on 4.3 hours of unseen multi-genre BC speech, and the best Word Error Rate (WER) of 18.59% is achieved using the RNN-based Language Model (LM). Moreover, a detailed word error analysis is carried out to compare the errors made by humans with those made by the Automatic Speech Recognition (ASR) system. The results show similar word misrecognition behavior for both humans and ASR.
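As a rough illustration of the WER metric quoted above: WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a minimal Python version of that standard definition. The function name `word_error_rate`, the whitespace tokenization, and the English example sentences are assumptions for illustration only; this is not the authors' scoring pipeline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace; no text normalization
    is applied (a real evaluation would normalize the transcripts first).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one substitution and one deletion over four words.
    print(word_error_rate("a b c d", "a x c"))  # prints 0.5
```

Because insertions are counted while the denominator is the reference length, WER can exceed 100% when the hypothesis is much longer than the reference.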
