Abstract

Simultaneous translation, for both text and speech, targets a real-time, low-latency scenario in which the model starts translating before reading the complete source input. Evaluating simultaneous translation models is more complex than evaluating offline models because latency must be considered in addition to translation quality. Despite the research community's growing focus on novel modeling approaches to simultaneous translation, it currently lacks a universal evaluation procedure. Therefore, we present SimulEval, an easy-to-use and general evaluation toolkit for both simultaneous text and speech translation. A server-client scheme is introduced to create a simultaneous translation scenario, where the server sends source input and receives predictions for evaluation, and the client executes customized policies. Given a policy, SimulEval automatically performs simultaneous decoding and reports several popular latency metrics. We also adapt latency metrics from simultaneous text translation to the speech task. Additionally, SimulEval is equipped with a visualization interface to provide a better understanding of a system's simultaneous decoding process. SimulEval has already been extensively used for the IWSLT 2020 shared task on simultaneous speech translation. Code will be released upon publication.

Highlights

  • While translation quality is usually measured by BLEU (Papineni et al., 2002; Post, 2018), a wide variety of latency measurements have been introduced, such as Average Proportion (AP) (Cho and Esipova, 2016), Consecutive Wait length (CW) (Gu et al., 2017), Average Lagging (AL) (Ma et al., 2019), Differentiable Average Lagging (DAL) (Cherry and Foster, 2019), and so on (see the computation sketch after this list)

  • The latency evaluation processes across different works are inconsistent: 1) the latency metric definitions are not precise enough with respect to text segmentation; 2) the definitions are not precise enough with respect to speech segmentation, e.g., some models are evaluated on the number of speech segments (Ren et al., 2020) while others are evaluated on time duration (Ansari et al., 2020); 3) little prior work has released implementations of the decoding process and latency measurement

  • While these latency metrics were originally defined for text translation, we discuss issues and solutions that arise when adapting them to the task of simultaneous speech translation
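
To make the metrics in the first bullet concrete, the sketch below computes AP and AL from a list of per-token delays g(t), i.e., the amount of source consumed when each target token is written, following the cited definitions. The function names and example values are ours, not part of SimulEval. For the speech adaptation discussed in the last bullet, the same computation applies when the delays and source length are measured in time (e.g., milliseconds of audio) rather than in tokens.

```python
def average_proportion(delays, src_len, tgt_len):
    # AP (Cho and Esipova, 2016): mean proportion of the source
    # consumed when each target token is generated.
    return sum(delays) / (src_len * tgt_len)

def average_lagging(delays, src_len, tgt_len):
    # AL (Ma et al., 2019): average amount of source the system lags
    # behind an ideal simultaneous translator, summed up to the first
    # target token emitted after the full source has been read.
    gamma = tgt_len / src_len
    total, tau = 0.0, 0
    for t, d in enumerate(delays, start=1):
        total += d - (t - 1) / gamma
        tau = t
        if d >= src_len:  # full source read; stop accumulating
            break
    return total / tau
```

For instance, a wait-3 system over a five-token source and five-token target has delays [3, 4, 5, 5, 5]; average_lagging returns 3.0 and average_proportion returns 0.88.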

Summary

Introduction

Simultaneous translation, the task of generating translations before reading the entire text or speech source input, has become an increasingly popular topic for both text and speech translation. SIMULEVAL adopts a server-client scheme to simulate this scenario: the server provides source input (text or audio) upon request from the client, receives predictions from the client, and returns different evaluation metrics when the translation process is complete (a hypothetical sketch of this loop is given below). SIMULEVAL has built-in support for quality metrics such as BLEU (Papineni et al., 2002; Post, 2018), TER (Snover et al., 2006) and METEOR (Banerjee and Lavie, 2005), and latency metrics such as AP, AL and DAL. Usage instructions and a case study are provided before concluding.
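
To illustrate the server-client interaction loop described above, here is a minimal, self-contained sketch of a client running a wait-k policy. All names (server.read, server.write, model.predict, run_client) are placeholders for illustration only and do not reflect SimulEval's actual interface.

```python
READ, WRITE = "read", "write"

def wait_k_policy(num_read, num_written, source_finished, k=3):
    """Read k source units up front, then alternate read/write."""
    if not source_finished and num_read < num_written + k:
        return READ
    return WRITE

def run_client(server, model, k=3):
    source, target = [], []
    finished = False
    while True:
        if wait_k_policy(len(source), len(target), finished, k) == READ:
            unit = server.read()          # request the next source unit
            if unit is None:              # server signals end of source
                finished = True
            else:
                source.append(unit)
        else:
            token = model.predict(source, target)  # user-defined prediction
            server.write(token)           # send prediction; server records the delay
            target.append(token)
            if token == "</s>":           # end of translation
                break
    return target
```

The key design point, as the paper's scheme suggests, is that the server alone tracks when each prediction arrives relative to the source it has released, so latency measurement stays consistent across systems regardless of how the client's policy is implemented.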

Task Formalization
Existing Text Latency Metrics
Adapting Metrics to the Speech Task
Server
User-Defined Agent
Client
Evaluation
Visualization
User-Defined Client
Case Study
Conclusion