Abstract

Grapheme-to-phoneme models are key components in automatic speech recognition and text-to-speech systems. They are particularly useful for low-resource language pairs that lack available, well-developed pronunciation lexicons. These models are based on initial alignments between grapheme source and phoneme target sequences. Inspired by sequence-to-sequence recurrent neural network-based translation methods, the current research presents an approach that applies an alignment representation for input sequences and pre-trained source and target embeddings to overcome the transliteration problem for a low-resource language pair. We participated in the English-Vietnamese transliteration task of the NEWS 2018 shared task.

Highlights

  • Transliteration means the phonetic translation of words in a source language (e.g. English) into equivalent words in a target language (e.g. Vietnamese)

  • We propose a new approach that uses an alignment representation for input sequences and pre-trained source/target embeddings in the input layer to build a neural network-based transliteration system, addressing the data-sparsity problem caused by a low-resource language pair

  • Our proposed approach for efficient transliteration consists of three main steps: (1) preprocessing, (2) modification of the input sequences based on an alignment representation and (3) creation of an RNN-based machine transliteration model
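
The three steps above can be sketched as follows. This is a minimal, hypothetical illustration of what the alignment representation of an input sequence might look like; the function names, the toy one-to-one alignment, and the example word are assumptions, not the authors' code (real systems typically use many-to-many alignment tools trained on a lexicon).

```python
def preprocess(word):
    """Step 1 (illustrative): lowercase and split a source word into graphemes."""
    return list(word.lower())

def alignment_representation(graphemes, phonemes):
    """Step 2 (illustrative): pair each grapheme with its aligned phoneme.
    A toy one-to-one alignment is shown; real alignments are many-to-many."""
    return list(zip(graphemes, phonemes))

# Step 3 would feed these aligned grapheme-phoneme pairs into an RNN
# encoder-decoder; here we only show the input representation.
pairs = alignment_representation(preprocess("London"),
                                 ["l", "o", "n", "d", "o", "n"])
print(pairs)  # [('l', 'l'), ('o', 'o'), ('n', 'n'), ('d', 'd'), ('o', 'o'), ('n', 'n')]
```

The point of the representation is that each input position carries both the source grapheme and its aligned target phoneme, giving the network an explicit alignment signal rather than leaving it to learn alignments from scratch.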


Summary

Introduction

Transliteration means the phonetic translation of words in a source language (e.g. English) into equivalent words in a target language (e.g. Vietnamese). It entails transforming a word from one writing system (the "source word") to a phonetically equivalent word in another writing system (the "target word") (Knight and Graehl, 1998). This transformation requires a large set of rules defined by expert linguists to determine how the phonemes are aligned and to take into account the phonological system of the target language. Statistics for many words must be sparsely estimated (Sutskever et al., 2014; Jean et al., 2014). To deal with this linguistic aspect, neural network-based approaches use continuous-space representations of words, or word embeddings, in which words that occur in similar contexts tend to be close to each other in representational space. The benefits of using neural networks, in particular recurrent neural networks, to deal with the sparsity problem are clear.
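
The idea that words occurring in similar contexts end up close in embedding space can be illustrated with cosine similarity over toy vectors. The three-dimensional embeddings below are made-up values for illustration only; trained embeddings have hundreds of dimensions and are learned from corpus co-occurrence statistics.

```python
import math

# Made-up toy embeddings: two city names that appear in similar contexts,
# plus an unrelated verb (values are illustrative, not trained).
emb = {
    "london": [0.9, 0.1, 0.2],
    "paris":  [0.8, 0.2, 0.1],
    "eat":    [0.1, 0.9, 0.7],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Words with similar contexts (the two city names) are closer to each
# other than to the unrelated word.
print(cosine(emb["london"], emb["paris"]) > cosine(emb["london"], emb["eat"]))  # True
```

This closeness is what lets a neural model generalize from frequent words to rare ones: a sparsely observed word inherits behavior from its neighbors in embedding space.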

Methods
Results
Conclusion

