Improved Unsupervised Statistical Machine Translation via Unsupervised Word Sense Disambiguation for a Low-Resource and Indic Languages

Shefali Saxena, Uttkarsh Chaurasia, Nitin Bansal, Philemon Daniel

doi:10.1080/03772063.2022.2098189

Abstract

Besides word order, word choice is a key stumbling block for machine translation (MT) in morphologically rich languages because of homonymy and polysemy. Untranslated and improperly translated words are likewise a severe issue for Statistical Machine Translation (SMT) models. The limited quantity of parallel training data has constrained SMT systems, yet recent lines of research have successfully trained SMT systems in an unsupervised manner (unsupervised SMT, USMT) using monolingual data alone. However, translation quality still needs improvement because of unaligned words and words translated in the wrong sense. This work addresses the problem by incorporating unsupervised Word Sense Disambiguation (WSD) into the decoding phase of USMT. It presents SMT systems for five English→Indic translation tasks, evaluated on the WMT test dataset with the BLEU and METEOR metrics. Experiments on the En→Hi, En→Kn, En→Ta, En→Te, and En→Be tasks show improvements over the baseline model of 2.3, 2.68, 0.78, 2.32, and 1.79 BLEU points and 1.07, 1.34, 0.72, 0.693, and 1.191 METEOR points, respectively.
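
The abstract does not say how the WSD signal enters the decoder, so the following is only an illustrative sketch and not the authors' method. It assumes a word-sense score can be added as one more weighted term on top of a phrase-table log-probability when choosing among translation candidates. The sense inventory, the phrase-table probabilities, the transliterated Hindi candidates, the `wsd_weight` parameter, and the function names are all hypothetical toy values chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical sense inventory: for each ambiguous source word, every target
# candidate is paired with context words typical of that sense. In an
# unsupervised setting these could be induced from monolingual data; here
# they are hand-written toy values.
SENSE_CONTEXTS = {
    "bank": {
        "kinara": Counter(["river", "water", "shore", "boat"]),
        "baink": Counter(["money", "loan", "account", "deposit"]),
    }
}

# Hypothetical phrase-table log-probabilities (toy numbers, not from the paper).
PHRASE_TABLE = {
    "bank": {"kinara": math.log(0.55), "baink": math.log(0.45)}
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rescore(word: str, sentence_tokens: list[str], wsd_weight: float = 2.0):
    """Pick the candidate maximising phrase log-prob + wsd_weight * sense score."""
    # The context is every token in the sentence except the ambiguous word.
    context = Counter(t for t in sentence_tokens if t != word)
    scores = {}
    for candidate, log_prob in PHRASE_TABLE[word].items():
        sense_score = cosine(context, SENSE_CONTEXTS[word][candidate])
        scores[candidate] = log_prob + wsd_weight * sense_score
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    for sentence in (
        "he opened an account at the bank to deposit money",
        "the boat reached the river bank",
    ):
        best, _ = rescore("bank", sentence.split())
        print(sentence, "->", best)
```

In this toy setup the phrase table alone always prefers the more probable candidate, while the added sense term lets the surrounding words override that preference. A real decoder would apply such a term per hypothesis during search rather than per isolated word.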

Journal: IETE Journal of Research
Publication Date: Jul 23, 2022
Citations: 2
