Abstract

This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2020: offline speech translation and simultaneous speech translation. The ON-TRAC Consortium is composed of researchers from three French academic laboratories: LIA (Avignon Université), LIG (Université Grenoble Alpes), and LIUM (Le Mans Université). Attention-based encoder-decoder models, trained end-to-end, were used for our submissions to the offline speech translation track. Our contributions focused on data augmentation and ensembling of multiple models. In the simultaneous speech translation track, we build on Transformer-based wait-k models for the text-to-text subtask. For speech-to-text simultaneous translation, we attach a wait-k MT system to a hybrid ASR system. We propose an algorithm to control the latency of the ASR+MT cascade and achieve a good latency-quality trade-off on both subtasks.

Highlights

  • While cascaded speech-to-text translation (AST) systems, which combine source-language speech recognition (ASR) with source-to-target text translation (MT), remain state of the art, recent work has attempted to build end-to-end AST systems, with very encouraging results (Bérard et al., 2016; Weiss et al., 2017; Bérard et al., 2018; Jia et al., 2019; Sperber et al., 2019).

  • We evaluate four wait-k systems, each trained with a value of k_train in {5, 7, 9, ∞} and decoded with k_eval ranging from 2 to 11.

  • The results demonstrate that multipath is competitive with wait-k without the need to select which path to optimize.
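
As a rough illustration of the wait-k setup mentioned in the highlights, the sketch below uses the standard wait-k read/write schedule: the decoder first reads k source tokens, then alternates between writing one target token and reading one source token. This is a minimal sketch under that textbook definition, not the consortium's actual implementation; the function names `wait_k_schedule` and `evaluation_grid` are hypothetical and introduced only for this example.

```python
import math


def wait_k_schedule(k, src_len, tgt_len):
    """Number of source tokens read before emitting each target token.

    Standard wait-k definition (an assumption here, not taken from the
    systems described above): before writing target token t (1-indexed),
    the decoder has read min(k + t - 1, src_len) source tokens.
    """
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]


def evaluation_grid():
    """Enumerate the (training k, decoding k) pairs listed in the highlights."""
    k_train_values = [5, 7, 9, math.inf]  # math.inf: full source read before writing
    k_eval_values = range(2, 12)          # decoding k from 2 to 11 inclusive
    return [(kt, ke) for kt in k_train_values for ke in k_eval_values]


if __name__ == "__main__":
    # With k = 3 and a 6-token source, the decoder stays 3 tokens behind
    # until the source is exhausted.
    print(wait_k_schedule(3, 6, 5))
    # k = infinity degenerates to offline translation: read everything first.
    print(wait_k_schedule(math.inf, 6, 3))
    print(len(evaluation_grid()))
```

A smaller decoding k lowers latency because the first target token is emitted earlier, at the cost of translating with less source context.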

Summary

Introduction

While cascaded speech-to-text translation (AST) systems, which combine source-language speech recognition (ASR) with source-to-target text translation (MT), remain state of the art, recent work has attempted to build end-to-end AST systems, with very encouraging results (Bérard et al., 2016; Weiss et al., 2017; Bérard et al., 2018; Jia et al., 2019; Sperber et al., 2019). This year, the IWSLT 2020 offline translation track attempts to evaluate whether end-to-end AST will close the gap with cascaded AST for the English-to-German language pair. Another increasingly popular topic is simultaneous (online) machine translation, which consists of generating an output hypothesis before the entire input sequence is available. The ON-TRAC Consortium is composed of researchers from three French academic laboratories: LIA (Avignon Université), LIG (Université Grenoble Alpes), and LIUM (Le Mans Université).
