Abstract
In this work, we develop a spontaneous large vocabulary speech recognition system for Qatari Arabic (QA). A major problem with dialectal Arabic speech recognition is due to the sparsity of speech resources. So, we propose an Automatic Speech Recognition (ASR) framework to jointly use Modern Standard Arabic (MSA) data and QA data to improve acoustic and language modeling by orthographic normalization, cross-dialectal phone mapping, data sharing, and acoustic model adaptation. A wide-band speech corpus has been developed for QA. The corpus consists of 15 hours speech data collected from different TV series and talk-show programs. The corpus was manually segmented and transcribed. A QA tri-gram language model (LM) was linearly interpolated with a large MSA LM in order to decrease Out-Of-Vocabulary (OOV) rate and to improve perplexity. The vocabulary consists of 21K words extracted from the QA training set with additional 256K MSA vocabulary. The acoustic model (AM) was trained with a data pool of QA data and additional 60 hours of MSA data. In order to boost the contribution of QA data, Maximum-A-Posteriori (MAP) adaptation was applied on the resulted AM using only the QA data, effectively increasing the weight of dialectal acoustic features in the final cross-lingual model. All trainings were performed with Maximum Mutual Information Estimation (MMIE) and with Speaker Adaptive Training (SAT) applied on top of MMIE. Our proposed approach achieves more than 16% relative reduction in WER on QA testing set compared to a baseline system trained with only QA data. This work was funded by a grant from the Qatar National Research Fund under its National Priorities Research Program (NPRP) award number NPRP 09-410-1-069. Reported experimental work was performed at Qatar University in collaboration with University of Illinois.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.