End-to-End Neural Transformer Based Spoken Language Understanding

Martin Radfar, Siegfried Kunzmann, Athanasios Mouchtaris

doi:10.21437/interspeech.2020-1963

Abstract

Spoken language understanding (SLU) refers to the process of inferring semantic information from audio signals. While neural transformers consistently deliver the best performance among state-of-the-art neural architectures in the field of natural language processing (NLP), their merits in the closely related field of SLU have not been investigated. In this paper, we introduce an end-to-end neural transformer-based SLU model that can predict the variable-length domain, intent, and slot vectors embedded in an audio signal with no intermediate token prediction architecture. This new architecture leverages the self-attention mechanism, by which the audio signal is transformed into various subspaces, allowing the model to extract the semantic context implied by an utterance. Our end-to-end transformer SLU predicts the domains, intents, and slots in the Fluent Speech Commands dataset with accuracy equal to 98.1%, 99.6%, and 99.6%, respectively, and outperforms SLU models that leverage a combination of recurrent and convolutional neural networks by 1.4%, while the size of our model is 25% smaller than that of these architectures. Additionally, due to independent subspace projections in the self-attention layer, the model is highly parallelizable, which makes it a good candidate for on-device SLU.
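The subspace projections mentioned in the abstract can be illustrated with a generic multi-head self-attention layer. The following is a minimal NumPy sketch of standard scaled dot-product attention, not the authors' implementation: the function name, the random weights, the number of heads, the model dimension, and the use of random vectors in place of real acoustic features are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(frames, w_q, w_k, w_v, w_o, num_heads):
    """Generic multi-head self-attention over a sequence of feature frames.

    frames: (T, d_model) array, one row per audio frame (hypothetical features).
    Returns the attended output (T, d_model) and the weights (num_heads, T, T).
    """
    T, d_model = frames.shape
    d_head = d_model // num_heads

    def split(x):
        # Split the model dimension into num_heads independent subspaces.
        return x.reshape(T, num_heads, d_head).transpose(1, 0, 2)

    q = split(frames @ w_q)
    k = split(frames @ w_k)
    v = split(frames @ w_v)

    # Each head attends within its own subspace; the heads do not interact
    # here, so they can be computed in parallel.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (H, T, T)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                  # (H, T, d_head)

    # Concatenate the heads and mix them with the output projection.
    merged = context.transpose(1, 0, 2).reshape(T, d_model)
    return merged @ w_o, weights


# Usage with made-up sizes: 50 frames, 64-dim features, 4 heads (all assumed).
rng = np.random.default_rng(0)
T, d_model, num_heads = 50, 64, 4
frames = rng.standard_normal((T, d_model))
w_q, w_k, w_v, w_o = (
    rng.standard_normal((d_model, d_model)) / np.sqrt(d_model) for _ in range(4)
)
out, weights = multi_head_self_attention(frames, w_q, w_k, w_v, w_o, num_heads)
print(out.shape, weights.shape)  # (50, 64) (4, 50, 50)
```

Because each head operates on its own slice of the projected features, the per-head computations are independent of one another, which is the property the abstract points to when it describes the model as highly parallelizable.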

Similar Papers

FANS: Fusing ASR and NLU for On-Device SLU
Martin Radfar ... Ariya Rastrow
30 Aug 2021

Ensemble Chinese End-to-End Spoken Language Understanding for Abnormal Event Detection from Audio Stream
Haoran Wei ... Sen Yang
24 Sep 2021

Investigating Adaptation and Transfer Learning for End-to-End Spoken Language Understanding from Speech
Natalia Tomashenko ... Yannick Estève
15 Sep 2019

End-to-End Spoken Language Understanding Using Joint CTC Loss and Self-Supervised, Pretrained Acoustic Encoders
Jixuan Wang ... Clement Chung
04 Jun 2023
