TLSPG: Transfer learning-based semi-supervised pseudo-corpus generation approach for zero-shot translation

Amit Kumar, Rajesh Kumar Mundotiya, Ajay Pratap, Anil Kumar Singh

doi:10.1016/j.jksuci.2022.03.008

Open Access

https://doi.org/10.1016/j.jksuci.2022.03.008

Abstract

Machine Translation (MT) has come a long way in recent years, but it still suffers from data scarcity because parallel corpora are lacking for low-resource (and sometimes zero-resource) languages. Transfer Learning (TL) is one widely used direction for overcoming this issue in low-resource MT systems. Creating a parallel corpus for such languages is another way of dealing with data scarcity, but it is a costly, time-consuming and laborious task. To avoid these limitations of parallel corpus creation, we present a TL-based Semi-supervised Pseudo-corpus Generation (TLSPG) approach for zero-shot MT systems. It generates the pseudo-corpus by exploiting the relatedness between low-resource language pairs and zero-resource language pairs via TL. Our experiments show empirically that such relatedness helps improve the performance of zero-shot MT systems. Experiments on zero-resource language pairs show that our approach outperforms the existing state-of-the-art models, yielding improvements of +15.56, +8.13, +3.98 and +2 BLEU points for Bhojpuri→Hindi, Magahi→Hindi, Hindi→Bhojpuri and Hindi→Magahi, respectively.
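
The abstract does not spell out the generation procedure. As a rough illustration only, the Python sketch below assumes one common form of pseudo-corpus generation: a model trained on a related low-resource language pair is applied to monolingual sentences of the zero-resource language, and each input is paired with its output. The names `generate_pseudo_corpus` and `toy_parent_model`, and the placeholder tokens, are hypothetical and are not taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple


def generate_pseudo_corpus(
    translate: Callable[[str], str],
    monolingual_sentences: Iterable[str],
) -> List[Tuple[str, str]]:
    """Pair each non-empty monolingual sentence with its machine translation.

    `translate` stands in for a model trained on a related low-resource
    language pair (an assumption, not the paper's actual system).
    """
    pairs: List[Tuple[str, str]] = []
    for sentence in monolingual_sentences:
        sentence = sentence.strip()
        if not sentence:
            continue  # skip blank lines in the monolingual data
        pairs.append((sentence, translate(sentence)))
    return pairs


def toy_parent_model(sentence: str) -> str:
    """Hypothetical stand-in translator: word-by-word lookup.

    Tokens missing from the lexicon are copied through unchanged, which
    loosely mimics the vocabulary overlap between related languages.
    """
    lexicon = {"w1": "W1", "w2": "W2"}  # placeholder tokens, not real words
    return " ".join(lexicon.get(token, token) for token in sentence.split())


if __name__ == "__main__":
    pseudo_corpus = generate_pseudo_corpus(toy_parent_model, ["w1 w2", "  ", "w3"])
    for source, target in pseudo_corpus:
        print(f"{source}\t{target}")
```

In a real setting the resulting pairs would serve as synthetic training data for the zero-resource direction; how TLSPG filters or reuses them is not stated in the abstract.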

Journal: Journal of King Saud University - Computer and Information Sciences
Publication Date: Mar 25, 2022
Citations: 2
License type: cc-by-nc-nd

Similar Papers

Neural Network Based Machine Translation Systems for Low Resource Languages: A Review
H S Sreedeepa ... Sumam Mary Idicula
22 Dec 2023

Unsupervised SMT: an analysis of Indic languages and a low resource language
Shefali Saxena ... Philemon Daniel
Journal of Experimental & Theoretical Artificial Intelligence | VOL. 36
29 Aug 2022

Leveraging Additional Resources for Improving Statistical Machine Translation on Asian Low-Resource Languages
Hai-Long Trieu ... Duc-Vu Tran
ACM Transactions on Asian and Low-Resource Language Information Processing | VOL. 18
17 Jun 2019

Transfer Learning for Low-Resource Neural Machine Translation
Barret Zoph ... Deniz Yuret
01 Jan 2015
