Joint Contextual Modeling for ASR Correction and Language Understanding

Yue Weng, Alexandros Papangelis, Sai Sumanth Miryala, Hugh Williams, Mahdi Namazifar, Chandra Khatri, Gokhan Tur, Runze Wang, Franziska Bell, Piero Molino, Huaixiu Zheng

doi:10.1109/icassp40776.2020.9053213

Abstract

The quality of automatic speech recognition (ASR) is critical to dialogue systems, because ASR errors propagate to and directly impact downstream tasks such as language understanding (LU). In this paper, we propose multi-task neural approaches that perform contextual language correction on ASR outputs jointly with LU, improving the performance of both tasks simultaneously. To measure the effectiveness of this approach, we used a public benchmark, the 2nd Dialogue State Tracking Challenge (DSTC2) corpus. As a baseline, we trained task-specific statistical language models (SLMs) and fine-tuned a state-of-the-art Generative Pre-Training (GPT) language model to re-rank the n-best ASR hypotheses, followed by a model that identifies the dialogue act and slots. Beyond this baseline, we make two contributions. (i) We trained ranker models using GPT and hierarchical CNN-RNN models with discriminative losses to detect the best output among the n-best hypotheses, and extended these ranker models to first select the best ASR output and then identify the dialogue act and slots in an end-to-end fashion. (ii) We proposed a novel joint ASR error correction and LU model, a word confusion pointer network (WCN-Ptr) with multi-head self-attention on top, which consumes the word confusions populated from the n-best hypotheses. We show that the error rates of an off-the-shelf ASR system and the LU system that follows it can be reduced significantly, by 14% relative, using joint models trained on small amounts of in-domain data.
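
To make the n-best re-ranking baseline concrete, here is a minimal sketch of scoring ASR hypotheses with a small in-domain language model and picking the most likely one. It is not the authors' implementation: the add-one smoothed bigram model stands in for the statistical language models mentioned in the abstract, and the toy corpus, the example hypotheses, and the function names `train_bigram_lm`, `log_prob`, and `rerank` are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenized in-domain sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams) + 1  # +1 reserves probability mass for unseen words
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


def rerank(nbest, unigrams, bigrams):
    """Sort hypotheses from most to least likely under the language model."""
    return sorted(nbest, key=lambda h: log_prob(h, unigrams, bigrams), reverse=True)


# Toy in-domain data (hypothetical, loosely in the style of restaurant search).
corpus = [
    "i want cheap food",
    "i want a cheap restaurant",
    "cheap food in the north",
]
unigrams, bigrams = train_bigram_lm(corpus)

# Hypothetical n-best list in which the recognizer's top choice is wrong.
nbest = ["i won cheap foot", "i want cheap food", "i want sheep food"]
best = rerank(nbest, unigrams, bigrams)[0]
print(best)
```

In this toy setup the in-domain model prefers the hypothesis whose word sequences it has seen before. A pipeline of the kind the abstract describes would then pass the selected hypothesis to a separate model for dialogue act and slot prediction, whereas the joint models perform both steps together.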

Similar Papers

Improving ASR Error Detection with RNNLM Adaptation
Rahhal Errattahi ... Hassan Ouahmane
01 Dec 2018

The Benefit Obtained from Visually Displayed Text from an Automatic Speech Recognizer During Listening to Speech Presented in Noise
Adriana A Zekveld ... Marcel S M G Vlaming
Ear & Hearing | VOL. 29
01 Dec 2008

Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors
Longshaokan Wang ... Lazaros Polymenakos
01 Jan 2020

Contrastive Learning for Robust Neural Machine Translation with ASR Errors
Dongyang Hu ... Junhui Li
01 Jan 2021
