Distilling Knowledge in Machine Translation of Agglutinative Languages with Backward and Morphological Decoders

Telem Joyson Singh, Sanasam Ranbir Singh, Priyankoo Sarmah

doi:10.1145/3703455

Open Access

https://doi.org/10.1145/3703455

Abstract

Agglutinative languages often have morphologically complex words (MCWs) composed of multiple morphemes arranged in a hierarchical structure, which poses significant challenges for translation. We present a novel knowledge distillation approach tailored to improving the translation of such languages. Our method comprises an encoder, a forward decoder, and two auxiliary decoders: a backward decoder and a morphological decoder. The forward decoder generates target morphemes autoregressively and is augmented by distilling knowledge from the auxiliary decoders. The backward decoder incorporates future context, while the morphological decoder integrates target-side morphological information. We also design a reliability estimation method that selectively distills only the reliable knowledge from these auxiliary decoders. Our approach relies on morphological word segmentation. We show that word segmentation based on unsupervised morphology learning outperforms the commonly used Byte Pair Encoding (BPE) method on highly agglutinative languages in translation tasks. Experiments on English-Tamil, English-Manipuri, and English-Marathi datasets show that the proposed approach achieves significant improvements over strong Transformer-based NMT baselines.
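To make the idea of selectively distilling from an auxiliary decoder concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not describe the reliability estimator, so the rule used here is an assumption: an auxiliary (teacher) decoder is treated as reliable at a target position only when it assigns the gold morpheme a higher probability than the forward (student) decoder does. The function names `kl_divergence` and `selective_distillation_loss` are hypothetical, and the teacher distributions are assumed to be already aligned to the forward decoder's positions (for a backward decoder this would mean reversing its outputs first).

```python
import math


def kl_divergence(teacher, student, eps=1e-12):
    """KL(teacher || student) for two probability vectors of equal length."""
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(teacher, student)
        if t > 0
    )


def selective_distillation_loss(student_probs, teacher_probs, gold_ids):
    """Average KL over the positions where the teacher is judged reliable.

    ASSUMPTION (not from the paper): the teacher is "reliable" at a position
    when it gives the gold token a higher probability than the student does.
    Returns (loss, number_of_positions_kept).
    """
    total, kept = 0.0, 0
    for s, t, y in zip(student_probs, teacher_probs, gold_ids):
        if t[y] > s[y]:  # hypothetical reliability gate
            total += kl_divergence(t, s)
            kept += 1
    return (total / kept if kept else 0.0), kept


# Toy example: two target positions, vocabulary of three morphemes.
student = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]   # forward decoder
backward = [[0.8, 0.1, 0.1], [0.1, 0.3, 0.6]]  # auxiliary decoder, aligned
gold = [0, 1]

loss, kept = selective_distillation_loss(student, backward, gold)
print(f"distillation loss = {loss:.4f} over {kept} position(s)")
```

In this toy input the teacher is better than the student only at the first position, so the second position contributes nothing to the loss. In a full system a term like this would presumably be added, with some weight, to the forward decoder's usual cross-entropy loss, once per auxiliary decoder; the weighting scheme is likewise not specified in the abstract.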

Full Text