Exploring Neural Machine Translation for Sinhala-Tamil Languages Pair

L N A S H Nissanka, B H R Pushpananda, A R Weerasinghe

doi:10.1109/icter51097.2020.9325466

Abstract

In the face of rapid globalization, translation plays a vital role in sustaining native languages. Most Neural Machine Translation research in Natural Language Processing has achieved impressive results by training on parallel corpora. Low-resource languages suffer from poor translation performance because parallel corpus data is scarce. Creating a parallel corpus for a language pair is expensive and requires people with expert knowledge of both languages. In this research, we examine the feasibility of developing a translator for the Sinhala-Tamil language pair using monolingual corpora. Byte Pair Encoding (BPE) is applied to overcome the Out-Of-Vocabulary (OOV) problem in both Sinhala and Tamil. The first part of the research uses a monolingual word embedding approach to translate between Sinhala and Tamil using only monolingual corpora. The second part uses both parallel and monolingual corpus data with the transformer architecture. The BLEU score and a synonym analysis are used to evaluate the proposed approach.
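The abstract names Byte Pair Encoding as the remedy for out-of-vocabulary words. As a rough illustration of the general technique only (not the authors' implementation, whose tooling and settings are not given here), the following toy sketch learns merge rules from a small word list and then segments an unseen word into known subword units. The function names `learn_bpe`, `merge_pair`, and `segment`, the `</w>` end-of-word marker, and the English toy words are all assumptions made for this example; real Sinhala or Tamil text would also need a decision on whether to split at code points or grapheme clusters.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken deterministically by the pair itself.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a (possibly unseen) word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    corpus = ["low", "low", "lower", "newest", "newest", "widest"]
    rules = learn_bpe(corpus, 10)
    # "lowest" never appears in the corpus, yet it still decomposes into subword units.
    print(segment("lowest", rules))
```

Because an unseen word always falls back to smaller units (down to single characters), the translation model never meets a token outside its vocabulary, which is the sense in which BPE addresses the OOV problem.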

Similar Papers

Transliteration and Byte Pair Encoding to Improve Tamil to Sinhala Neural Machine Translation
Pasindu Tennage ... Surangika Ranathunga
01 May 2018

Controlling byte pair encoding for neural machine translation
Alfred John Tacorda ... Rachel Edita Roxas
01 Dec 2017

Bidirectional LSTMs with Byte Pair Encoding in NMT for CLIR using English and Telugu Parallel Corpus
B N V Narasimha Raju et al.
International Journal on Recent and Innovation Trends in Computing and Communication | VOL. 11
30 Oct 2023

The impact of some linguistic features on the quality of neural machine translation
Elena A Shukshina
Journal of Applied Linguistics and Lexicography | VOL. 1
01 Jan 2019
