Abstract

We propose word-region alignment-guided multimodal neural machine translation (MNMT), a novel MNMT model that uses word-region alignment (WRA) to link the textual and visual modalities semantically. Existing studies on MNMT have mainly focused on the effect of integrating visual and textual modalities, but they do not leverage the semantic relevance between the two. We strengthen the semantic correlation between textual and visual modalities in MNMT by incorporating WRA as a bridge. We implement the proposal on two mainstream neural machine translation (NMT) architectures: the recurrent neural network (RNN) and the transformer. Experiments on two public benchmarks, the English–German and English–French translation tasks of the Multi30k dataset and the English–Japanese translation task of the Flickr30kEnt-JP dataset, show that our model improves significantly over competitive baselines across different evaluation metrics and outperforms most existing MNMT models. For example, BLEU improves by 1.0 points on the English–German task and by 1.1 points on the English–French task on the Multi30k test2016 set, and by 0.7 points on the English–Japanese task on the Flickr30kEnt-JP test set. Further analysis demonstrates that integrating WRA leads to better use of visual information and, in turn, better translation performance.
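
To make the idea of word-region alignment concrete, the following is a minimal sketch and not the model described above. The abstract does not specify how WRA is computed, so the scoring used here (cosine similarity between word and region vectors, normalized with a softmax over regions) is an assumption chosen for illustration. The function names and the toy two-dimensional vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_region_alignment(word_vecs, region_vecs):
    """Return, for each word, a probability distribution over image regions.

    Assumption: alignment = softmax of cosine similarities. The paper's
    actual WRA formulation may differ.
    """
    return [softmax([cosine(w, r) for r in region_vecs]) for w in word_vecs]


# Toy, made-up embeddings: two source words and two image regions.
words = {"dog": [1.0, 0.0], "ball": [0.0, 1.0]}
regions = [[0.9, 0.1], [0.1, 0.9]]

alignment = word_region_alignment(list(words.values()), regions)

# Index of the most strongly aligned region for each word.
best = {
    word: max(range(len(regions)), key=lambda j: row[j])
    for word, row in zip(words, alignment)
}
print(best)
```

In this toy setup each word aligns most strongly with the region whose vector points in the same direction. In an MNMT system, such alignment weights could be used to select or reweight region features before they are fused with the text representation.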

INTRODUCTION

This document is a template for Microsoft Word versions 6.0 or later. If you are reading a paper or PDF version of this document, please download the electronic file, trans_jour.docx, from the IEEE Web site at www.ieee.org/authortools so you can use it to prepare your manuscript. If you would prefer to use LaTeX, download IEEE’s LaTeX style and sample files from the same Web page. You can explore using the Overleaf editor at https://www.overleaf.com/blog/278-how-to-use-overleaf-withieee-collabratec-your-quick-guide-to-gettingstarted#.Vp6tpPkrKM9. If your paper is intended for a conference, please contact your conference editor concerning acceptable word processor formats for your particular conference.

GUIDELINES FOR MANUSCRIPT PREPARATION
Abbreviations and Acronyms
Other Recommendations
Equations
SOME COMMON MISTAKES
GUIDELINES FOR GRAPHICS PREPARATION AND SUBMISSION
Sizing of Graphics
Resolution
Color Space
Accepted Fonts Within Figures
Using Labels Within Figures
SUBMITTING YOUR PAPER FOR REVIEW
Final Stage Using ScholarOne Manuscripts
Copyright Form
IEEE PUBLISHING POLICY
PUBLICATION PRINCIPLES