Abstract

Machine translation automatically translates text from one natural language to another. Neural machine translation (NMT) is the state-of-the-art approach to machine translation, but it requires adequate training data, which is a severe problem for low-resource language pairs. Multimodality is introduced into NMT by merging textual features with visual features to improve low-resource translation. WAT2021 (Workshop on Asian Translation 2021) organized a shared task on multimodal translation for English to Hindi. We participated in this task under the team name CNLP-NITS-PP with two submissions: multimodal and text-only NMT. This work investigates phrase pair injection via a data augmentation approach and improves over our previous work at WAT2020 on the same task in both text-only and multimodal NMT. We achieved second rank on the challenge test set for English to Hindi multimodal translation, with a Bilingual Evaluation Understudy (BLEU) score of 39.28, a Rank-based Intuitive Bilingual Evaluation Score (RIBES) of 0.792097, and an Adequacy-Fluency Metrics (AMFM) score of 0.830230.

Highlights

  • Multimodal neural machine translation (MNMT) aims to draw insights from input data across different modalities such as text, image, and audio.

  • NMT performance can be enhanced by utilizing monolingual data (Sennrich et al, 2016; Zhang and Zong, 2016; Laskar et al, 2020b) and phrase pair injection (Sen et al, 2020), techniques that are effective in low-resource language pair translation.

  • This paper addresses English to Hindi translation using the multimodal concept, taking advantage of monolingual data and phrase pair injection to improve translation quality at the WAT2021 translation task.


Summary

Introduction

Multimodal NMT (MNMT) aims to draw insights from input data across different modalities such as text, image, and audio. Combining information from more than one modality is intended to improve the quality of low-resource language translation. The encoder-decoder architecture is widely used in the MT community for text-only NMT because it handles issues such as variable-length phrases, through sequence-to-sequence learning, and long-term dependencies, through Long Short-Term Memory (LSTM) units (Sutskever et al, 2014). NMT performance can be enhanced by utilizing monolingual data (Sennrich et al, 2016; Zhang and Zong, 2016; Laskar et al, 2020b) and phrase pair injection (Sen et al, 2020), techniques that are effective in low-resource language pair translation. This paper addresses English to Hindi translation using the multimodal concept, taking advantage of monolingual data and phrase pair injection to improve translation quality at the WAT2021 translation task.
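
As a rough illustration of phrase pair injection as a data augmentation step, the sketch below appends short phrase pairs to a sentence-level parallel corpus before training. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `inject_phrase_pairs`, the length limits, and the toy English-Hindi pairs are all hypothetical, and the phrase pairs are assumed to have been extracted beforehand by some external alignment tool.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (source text, target text)


def inject_phrase_pairs(
    parallel: List[Pair],
    phrase_pairs: List[Pair],
    min_tokens: int = 1,   # hypothetical lower bound on phrase length
    max_tokens: int = 7,   # hypothetical upper bound on phrase length
) -> List[Pair]:
    """Return the parallel corpus extended with filtered, de-duplicated phrase pairs."""
    seen = set(parallel)
    augmented = list(parallel)
    for src, tgt in phrase_pairs:
        src, tgt = src.strip(), tgt.strip()
        n_src, n_tgt = len(src.split()), len(tgt.split())
        # Skip empty or overly long phrases on either side.
        if not (min_tokens <= n_src <= max_tokens and min_tokens <= n_tgt <= max_tokens):
            continue
        # Skip pairs that are already present in the training data.
        if (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        augmented.append((src, tgt))
    return augmented


# Toy data, invented for illustration only.
sentences = [("a man riding a horse", "घोड़े की सवारी करता एक आदमी")]
phrases = [("a horse", "एक घोड़ा"), ("a horse", "एक घोड़ा"), ("", "एक")]

train_data = inject_phrase_pairs(sentences, phrases)
print(len(train_data), "training pairs after injection")
```

The augmented list can then be written out as the source and target training files for whichever NMT toolkit is in use; the original sentence pairs are kept first so that injection only ever adds examples.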
