Neural Machine Translation Systems Research Articles

Multimodal machine translation (MMT) is an attractive application of neural machine translation (NMT) that commonly incorporates image information. However, the MMT models proposed thus far perform only comparably to, or slightly better than, their text-only counterparts. One potential cause of this limited gain is a lack of large-scale data. Most previous studies mitigate this limitation by employing, in various ways, large-scale textual parallel corpora, which are more accessible than multimodal parallel corpora. However, these corpora are still available only on a limited scale for low-resource language pairs or domains. In this study, we leveraged monolingual (or multimodal monolingual) corpora, which are available at scale in most languages and domains, to improve MMT models. Our approach follows previous unimodal work that uses monolingual corpora to train a word embedding or language model and incorporates it into NMT systems. While these methods demonstrated the advantage of using pre-trained representations, there is still room for MMT models to improve. To this end, our system employs debiasing procedures for the word embedding and a multimodal extension of the language model (visual-language model, VLM) to make better use of the pre-trained knowledge in the MMT task. Evaluations on the de facto standard MMT dataset for English–German translation indicate that the improvement obtained using the well-tailored word embedding and the VLM is approximately +1.84 BLEU and +1.63 BLEU, respectively. The evaluation on multiple language pairs shows that both are applicable across languages. Beyond the success of our system, we also conducted an extensive analysis of VLM manipulation and identified promising areas for developing better MMT models by exploiting VLM: some benefits brought by either modality are missing, and MMT with VLM generates less fluent translations.
Our code is available at https://github.com/toshohirasawa/mmt-with-monolingual-data.
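
The abstract does not say which debiasing procedures are applied to the word embedding. As an illustration only, the sketch below assumes one common option: centering the embedding matrix and removing its dominant principal directions. The function name `remove_top_directions`, the parameter `n_components`, and the toy data are hypothetical and are not taken from the authors' repository.

```python
import numpy as np


def remove_top_directions(embeddings: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Center the embeddings and remove their top principal directions.

    Assumed debiasing procedure for illustration; not the authors' code.
    """
    # Step 1: subtract the mean vector so the embeddings are centered at the origin.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Step 2: find the principal directions via SVD of the centered matrix.
    # Rows of vt are unit-length directions ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:n_components]  # shape: (n_components, dim)
    # Step 3: subtract each vector's projection onto the top directions.
    return centered - centered @ top.T @ top


# Toy stand-in for a pre-trained embedding matrix: 100 words, 8 dimensions,
# with a shared offset that acts as the common "bias" component.
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(100, 8)) + 5.0
debiased = remove_top_directions(toy_embeddings, n_components=2)
```

In a real setting the input would be a pre-trained embedding matrix rather than random data, and the number of removed directions would need tuning for the embedding at hand.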

Background. There are not many machine translation companies on the market whose products are in demand. Examples include free and commercial products such as “Google Translate”, “DeepL Translator”, “ModernMT”, “Apertium”, and “Trident”, to name a few. To make the development of high-quality neural machine translation systems (NMTS) more efficient and productive, scientifically grounded methods of NMTS engineering are needed so that a high-quality and competitive product can be obtained as quickly as possible. Objective. The purpose of this article is to apply the Eriksson-Penker business profile to the development and formalization of a method for system engineering of NMTS. Methods. The idea behind the NMTS engineering method is to apply the Eriksson-Penker system engineering methodology and business profile to formalize an ordered way of developing NMT systems. Results. The method of developing NMT systems based on system engineering techniques consists of three main stages. At the first stage, the structure of the NMT system is modelled in the form of an Eriksson-Penker business profile. At the second stage, a set of processes specific to the class of Data Science systems and to the international CRISP-DM standard is determined. At the third stage, verification and validation of the developed NMTS are carried out. Conclusions. The article proposes a method of system engineering of NMTS based on a modified Eriksson-Penker business profile representation of the system at the meta-level, as well as on international process standards of Data Science and Data Mining. The effectiveness of this method was studied on the example of developing a bidirectional English-Ukrainian NMTS, EUMT (English-Ukrainian Machine Translator), and it was found that the English-Ukrainian translation quality of the EUMT system is at least as good as that of the popular Google Translate translator.
The full code of the EUMT system is published on GitHub and is available at: https://github.com/EugeneSel/EUMT.

Articles published on Neural Machine Translation Systems

Neural machine translation model combining dependency syntax and LSTM

Machine Translation Systems for English Captions to Hindi Language Using Deep Learning

Pre-Trained Word Embedding and Language Model Improve Multimodal Machine Translation: A Case Study in Multi30K

A Simple Yet Robust Algorithm for Automatic Extraction of Parallel Sentences: A Case Study on Arabic-English Wikipedia Articles

Dynamic Multi-Branch Layers for On-Device Neural Machine Translation

Improving the Performance of Vietnamese–Korean Neural Machine Translation with Contextual Embedding

Low Resource Neural Machine Translation: Assamese to/from Other Indo-Aryan (Indic) Languages

Tag-less back-translation

A Transformer-Based Neural Machine Translation Model for Arabic Dialects That Utilizes Subword Units

METHOD OF SYSTEM ENGINEERING OF NEURAL MACHINE TRANSLATION SYSTEMS

Factors Behind the Effectiveness of an Unsupervised Neural Machine Translation System between Korean and Japanese

Investigating syntactic priming cumulative effects in MT-human interaction

Surprise Language Challenge: Developing a Neural Machine Translation System between Pashto and English in Two Months

Source-side Reordering to Improve Machine Translation between Languages with Distinct Word Orders

Cadlaws – An English–French Parallel Corpus of Legally Equivalent Documents

Reinforced NMT for Sentiment and Content Preservation in Low-resource Scenario

Research on Uyghur-Chinese Neural Machine Translation Based on the Transformer at Multistrategy Segmentation Granularity

Philipp Koehn: Neural Machine Translation

Investigating usability in postediting neural machine translation: Evidence from translation trainees' self-perception and performance

POS-Tagging based Neural Machine Translation System for European Languages using Transformers
