Abstract

Grapheme-to-phoneme models are key components in automatic speech recognition and text-to-speech systems. They are particularly useful for low-resource language pairs that lack available, well-developed pronunciation lexicons. These models are based on initial alignments between grapheme source and phoneme target sequences. Inspired by sequence-to-sequence recurrent neural network-based translation methods, the current research presents an approach that applies an alignment representation for input sequences and pre-trained source and target embeddings to overcome the transliteration problem for a low-resource language pair. We participated in the English-Vietnamese transliteration task of the NEWS 2018 shared task.

Highlights

  • Transliteration means the phonetic translation of words in a source language (e.g. English) into equivalent words in a target language (e.g. Vietnamese)

  • We propose a new approach that uses an alignment representation for input sequences and pre-trained source/target embeddings in the input layer to build a neural network-based transliteration system, addressing the data-sparsity problem caused by a low-resource language pair

  • Our proposed approach for efficient transliteration consists of three main steps: (1) preprocessing, (2) modification of the input sequences based on an alignment representation and (3) creation of an RNN-based machine transliteration model
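
The three steps above can be sketched as follows. This is a minimal, hypothetical illustration of what the alignment representation of an input sequence might look like; the function names, the toy one-to-one alignment, and the example word are assumptions, not the authors' code (real systems typically use many-to-many alignment tools trained on a lexicon).

```python
def preprocess(word):
    """Step 1 (illustrative): lowercase and split a source word into graphemes."""
    return list(word.lower())

def alignment_representation(graphemes, phonemes):
    """Step 2 (illustrative): pair each grapheme with its aligned phoneme.
    A toy one-to-one alignment is shown; real alignments are many-to-many."""
    return list(zip(graphemes, phonemes))

# Step 3 would feed these aligned grapheme-phoneme pairs into an RNN
# encoder-decoder; here we only show the input representation.
pairs = alignment_representation(preprocess("London"),
                                 ["l", "o", "n", "d", "o", "n"])
print(pairs)  # [('l', 'l'), ('o', 'o'), ('n', 'n'), ('d', 'd'), ('o', 'o'), ('n', 'n')]
```

The point of the representation is that each input position carries both the source grapheme and its aligned target phoneme, giving the network an explicit alignment signal rather than leaving it to learn alignments from scratch.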


Summary

Introduction

Transliteration means the phonetic translation of words in a source language (e.g. English) into equivalent words in a target language (e.g. Vietnamese). It entails transforming a word from one writing system (the "source word") to a phonetically equivalent word in another writing system (the "target word") (Knight and Graehl, 1998). This transformation requires a large set of rules defined by expert linguists to determine how the phonemes are aligned and to take into account the phonological system of the target language. Statistics for many words must be sparsely estimated (Sutskever et al., 2014; Jean et al., 2014). To deal with this linguistic aspect, neural network-based approaches use continuous-space representations of words, or word embeddings, in which words that occur in similar contexts tend to be close to each other in representational space. The benefits of using neural networks, in particular recurrent neural networks, to deal with the sparsity problem are clear.
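
The idea that words occurring in similar contexts end up close in embedding space can be illustrated with cosine similarity over toy vectors. The three-dimensional embeddings below are made-up values for illustration only; trained embeddings have hundreds of dimensions and are learned from corpus co-occurrence statistics.

```python
import math

# Made-up toy embeddings: two city names that appear in similar contexts,
# plus an unrelated verb (values are illustrative, not trained).
emb = {
    "london": [0.9, 0.1, 0.2],
    "paris":  [0.8, 0.2, 0.1],
    "eat":    [0.1, 0.9, 0.7],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Words with similar contexts (the two city names) are closer to each
# other than to the unrelated word.
print(cosine(emb["london"], emb["paris"]) > cosine(emb["london"], emb["eat"]))  # True
```

This closeness is what lets a neural model generalize from frequent words to rare ones: a sparsely observed word inherits behavior from its neighbors in embedding space.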

Methods
Results
Conclusion

