Phonetically rich and balanced speech corpus for Arabic speaker-independent continuous automatic speech recognition systems

Mohammad A M Abushariah, Moustafa Elshafei, Othman O Khalifa, Roziati Zainuddin, Raja N Ainon

doi:10.1109/isspa.2010.5605554

Abstract

This paper describes an efficient framework for designing and developing Arabic speaker-independent continuous automatic speech recognition systems based on a phonetically rich and balanced speech corpus. The speech corpus contains 415 sentences recorded by 42 native Arabic speakers (21 male and 21 female) from 11 Arab countries representing three major regions (Levant, Gulf, and Africa). The developed system is based on the Carnegie Mellon University (CMU) Sphinx tools. The Cambridge HTK tools were also used in some testing stages. The speech engine uses Hidden Markov Models (HMMs) with three emitting states for triphone-based acoustic models. Based on experimental analysis of 4.07 hours of training speech data, the acoustic model used a continuous observation probability model with 16 Gaussian mixture distributions, and the state distributions were tied to 400 senones. The language model contains both bigrams and trigrams. The system obtained 91.23% and 92.54% correct word recognition with and without diacritical marks, respectively.
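The abstract reports "correct word recognition" percentages but does not say how they were computed. As a rough illustration only, the sketch below assumes the HTK-style "percent correct" convention, 100 × H / N, where H is the number of correctly recognized words and N is the number of reference words, so insertions are not penalized. The function names and the toy sentences are hypothetical and are not taken from the paper or from the Sphinx or HTK tools.

```python
def count_word_errors(ref, hyp):
    """Align two word lists by minimum edit distance.

    Returns (hits, substitutions, deletions, insertions). Among
    equal-cost alignments, the one with the most hits is chosen.
    """
    rows, cols = len(ref) + 1, len(hyp) + 1
    # Each cell holds (cost, hits, subs, dels, ins) for ref[:i] vs hyp[:j].
    table = [[None] * cols for _ in range(rows)]
    table[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, rows):
        table[i][0] = (i, 0, 0, i, 0)  # only deletions
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, 0, j)  # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            c, h, s, d, k = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, h + 1, s, d, k)      # matching word
            else:
                diag = (c + 1, h, s + 1, d, k)  # substitution
            c, h, s, d, k = table[i - 1][j]
            dele = (c + 1, h, s, d + 1, k)      # reference word missing
            c, h, s, d, k = table[i][j - 1]
            ins = (c + 1, h, s, d, k + 1)       # extra hypothesis word
            table[i][j] = min(diag, dele, ins, key=lambda t: (t[0], -t[1]))
    return table[-1][-1][1:]


def percent_correct(ref, hyp):
    """Assumed metric: 100 * H / N with N = H + S + D (non-empty reference)."""
    hits, subs, dels, _ = count_word_errors(ref, hyp)
    return 100.0 * hits / (hits + subs + dels)


# Toy example with placeholder tokens, not data from the corpus.
reference = "a b c d".split()
hypothesis = "a x c".split()
print(count_word_errors(reference, hypothesis))
print(percent_correct(reference, hypothesis))
```

Under this assumed convention, a figure such as 91.23% would mean that roughly 91 of every 100 reference words were recognized correctly. A stricter word accuracy measure would also subtract insertions from H.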

Similar Papers

Acoustic training system for speaker independent continuous Arabic speech recognition system
M Nofal ... N.A Kader
18 Dec 2004

Arabic Dialectical Speech Recognition in Mobile Communication Services
Qiru Zhou ... Imed Zitouni
01 Nov 2008

Broad class network generation using a combination of rules and statistics for speaker independent continuous speech
B Chigier ... R.A Brennan
11 Apr 1988

Phonetically rich and balanced text and speech corpora for Arabic language
Mohammad A M Abushariah ... Raja N Ainon
Language Resources and Evaluation | VOL. 46
05 Nov 2011