Abstract

Extracting pronunciation from written text is necessary in many application areas, especially in text-to-speech synthesis. Bangla is not a fully phonetic language: there is not always a direct mapping from orthography to pronunciation. It mainly suffers from the ‘schwa deletion’ problem, along with ambiguities in certain letters and conjuncts. Rule-based approaches cannot completely solve this problem. In this paper, we propose an encoder-decoder neural machine translation (NMT) model for determining the pronunciation of Bangla words. We cast the pronunciation task as a sequence-to-sequence problem and built our model from two Gated Recurrent Unit recurrent neural networks (GRU-RNNs). We fed the model with two types of input: in one experiment we used ‘raw’ words, and in the other we used ‘pre-processed’ words (normalized by hand-written rules). Both experiments showed promising results, and the resulting models can be used in practical applications.
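The encoder half of such a sequence-to-sequence model can be sketched with the standard GRU update equations (update gate, reset gate, candidate state): the encoder consumes the grapheme sequence one character at a time, and its final hidden state summarizes the word for the decoder. The GRU equations below are the standard formulation; the dimensions, initialization, and one-hot toy input are illustrative assumptions, not the paper's actual configuration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # W is a list of rows; returns W @ v
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]

def vadd(*vs):
    # element-wise sum of equal-length vectors
    return [sum(t) for t in zip(*vs)]

class GRUCell:
    """Minimal GRU cell over plain Python lists (illustrative sketch)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]
        self.hidden_size = hidden_size
        # parameters for update gate z, reset gate r, and candidate state h~
        self.Wz, self.Uz, self.bz = mat(hidden_size, input_size), mat(hidden_size, hidden_size), [0.0] * hidden_size
        self.Wr, self.Ur, self.br = mat(hidden_size, input_size), mat(hidden_size, hidden_size), [0.0] * hidden_size
        self.Wh, self.Uh, self.bh = mat(hidden_size, input_size), mat(hidden_size, hidden_size), [0.0] * hidden_size

    def step(self, x, h):
        # z_t = sigmoid(Wz x + Uz h + bz)   -- how much of the state to update
        z = [sigmoid(v) for v in vadd(matvec(self.Wz, x), matvec(self.Uz, h), self.bz)]
        # r_t = sigmoid(Wr x + Ur h + br)   -- how much past state feeds the candidate
        r = [sigmoid(v) for v in vadd(matvec(self.Wr, x), matvec(self.Ur, h), self.br)]
        rh = [r_i * h_i for r_i, h_i in zip(r, h)]
        # h~_t = tanh(Wh x + Uh (r * h) + bh)
        h_tilde = [math.tanh(v) for v in vadd(matvec(self.Wh, x), matvec(self.Uh, rh), self.bh)]
        # h_t = (1 - z) * h + z * h~
        return [(1 - z_i) * h_i + z_i * ht_i
                for z_i, h_i, ht_i in zip(z, h, h_tilde)]

def encode(cell, one_hot_seq):
    """Run the encoder over a grapheme sequence; the final hidden state
    is the fixed-length summary passed to the decoder."""
    h = [0.0] * cell.hidden_size
    for x in one_hot_seq:
        h = cell.step(x, h)
    return h

# Toy usage: a 3-character "word" over a 4-symbol alphabet, encoded one-hot.
cell = GRUCell(input_size=4, hidden_size=5)
summary = encode(cell, [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]])
```

A decoder GRU would then emit the phoneme sequence step by step, conditioned on this summary; training both halves jointly is what makes the model learn context-dependent rules such as schwa deletion rather than relying on hand-written ones.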
