A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation

Yosuke Higuchi, Yuya Fujita, Nanxin Chen, Hirofumi Inaguma, Jaesong Lee, Tianzi Wang, Jumon Nozaki, Shinji Watanabe, Tatsuya Komatsu

doi:10.1109/asru51503.2021.9688157

Abstract

Non-autoregressive (NAR) models generate multiple outputs in a sequence simultaneously, which significantly reduces inference time at the cost of an accuracy drop compared to autoregressive (AR) baselines. Because NAR models show great potential for real-time applications, an increasing number of them have been explored in different fields to narrow the performance gap with AR models. In this work, we conduct a comparative study of various NAR modeling methods for end-to-end automatic speech recognition (ASR). Experiments are performed in a state-of-the-art setting using ESPnet. The results on various tasks provide interesting findings for developing an understanding of NAR ASR, such as the accuracy-speed trade-off and robustness against long-form utterances. We also show that the techniques can be combined for further improvement and applied to NAR end-to-end speech translation. All the implementations are publicly available to encourage further research in NAR speech processing.
