Robust Understanding of Robot-Directed Speech Commands Using Sequence to Sequence With Noise Injection.

Yuuki Tada, Yoshinobu Hagiwara, Hiroki Tanaka, Tadahiro Taniguchi

doi:10.3389/frobt.2019.00144

Abstract

This paper describes a new method that enables a service robot to understand spoken commands robustly using off-the-shelf automatic speech recognition (ASR) systems and an encoder-decoder neural network with noise injection. In service robotics, the understanding of spoken commands is often modeled as a mapping from speech signals to a sequence of commands that a robot can understand and perform. In a conventional approach, speech signals are recognized, and semantic parsing is applied to infer the command sequence from the utterance. However, if errors occur during speech recognition, a conventional semantic parsing method cannot be applied appropriately because most natural language processing methods do not account for such errors. We propose the use of encoder-decoder neural networks, e.g., sequence to sequence, with noise injection. The noise is injected into phoneme sequences during the training phase of encoder-decoder neural network-based semantic parsing systems. We demonstrate that the use of neural networks with noise injection can mitigate the negative effects of speech recognition errors in understanding robot-directed speech commands, i.e., it increases the performance of semantic parsing. We implemented the method and evaluated it using the commands given during a general purpose service robot (GPSR) task, such as the task used in RoboCup@Home, a standard competition for testing service robots. The results of the experiment show that the proposed method, namely, sequence to sequence with noise injection (Seq2Seq-NI), outperforms the baseline methods. In addition, Seq2Seq-NI enables a robot to understand a spoken command even when the output of an off-the-shelf ASR system contains recognition errors. Moreover, we describe an experiment conducted to evaluate the influence of the injected noise and discuss the results.
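The noise injection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the noise model, so the edit operations (substitution, deletion, insertion), the `rate` parameter, the `PHONEMES` inventory, and both function names are assumptions made for illustration, not the authors' implementation. The idea shown is only that noisy phoneme sequences are paired with unchanged target command sequences when building training data.

```python
import random

# Hypothetical phoneme inventory for illustration only; the phoneme set
# used in the paper is not given in this excerpt.
PHONEMES = ["a", "i", "u", "e", "o", "k", "s", "t", "n", "h", "m", "r"]


def inject_noise(phonemes, rate, rng):
    """Return a noisy copy of `phonemes`.

    With probability `rate`, each position receives one edit chosen
    uniformly from substitution, deletion, or insertion. This edit model
    is an assumption, not the authors' specification.
    """
    noisy = []
    for p in phonemes:
        if rng.random() >= rate:
            noisy.append(p)  # keep the phoneme unchanged
            continue
        op = rng.choice(("substitute", "delete", "insert"))
        if op == "substitute":
            # Replace with a different phoneme from the inventory.
            noisy.append(rng.choice([q for q in PHONEMES if q != p]))
        elif op == "insert":
            # Keep the phoneme and add a random one after it.
            noisy.append(p)
            noisy.append(rng.choice(PHONEMES))
        # "delete": append nothing
    return noisy


def make_training_pairs(clean_pairs, rate, copies, seed=0):
    """Pair several noisy variants of each phoneme sequence with the
    unchanged target command sequence."""
    rng = random.Random(seed)
    pairs = []
    for phonemes, commands in clean_pairs:
        for _ in range(copies):
            pairs.append((inject_noise(phonemes, rate, rng), commands))
    return pairs
```

A call such as `make_training_pairs([(["k", "i", "t"], ["go", "kitchen"])], rate=0.1, copies=5)` would yield five noisy inputs that all share the same target; the command tokens here are hypothetical. An encoder-decoder model trained on such pairs sees corrupted inputs mapped to correct outputs, which is the intuition behind tolerating ASR errors at test time.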

Highlights

Speech recognition errors are a significant problem in practical tasks performed by service robots.
The spoken commands given by a human user are conventionally recognized and understood by a robot in the following manner: First, the robot recognizes a sentence spoken by a human user by applying an automatic speech recognition (ASR) system such as Google Cloud Speech-to-Text API, CMU Sphinx, or Julius.
We considered the understanding to be a success if the robot could translate an input phoneme or word sequence into a ground-truth command sequence.
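The success criterion in the last highlight can be read as an exact-match test on command sequences. The sketch below takes that reading as an assumption; the function name and the example command tokens are hypothetical and do not come from the paper.

```python
def understanding_success_rate(predicted, ground_truth):
    """Fraction of utterances whose predicted command sequence is
    identical to the ground-truth command sequence (exact match)."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have equal length")
    if not ground_truth:
        return 0.0
    # An utterance counts as understood only if every command token matches.
    hits = sum(list(p) == list(g) for p, g in zip(predicted, ground_truth))
    return hits / len(ground_truth)


# Hypothetical command sequences for two utterances.
truth = [["go", "kitchen"], ["grasp", "cup"]]
preds = [["go", "kitchen"], ["grasp", "bottle"]]
rate = understanding_success_rate(preds, truth)
```

Under this criterion a partially correct sequence counts as a failure, so `rate` above is 0.5.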

Summary

Introduction

Speech recognition errors are a significant problem in practical tasks performed by service robots. The spoken commands given by a human user are conventionally recognized and understood by a robot in the following manner: First, the robot recognizes a sentence spoken by a human user by applying an automatic speech recognition (ASR) system such as Google Cloud Speech-to-Text API, CMU Sphinx, or Julius. The recognized sentence is then parsed syntactically and semantically. For service robots, this parsing maps the recognized sentence to a sequence of commands written in an artificial language that the robots can understand and carry out (Poon, 2013).

Journal: Frontiers in Robotics and AI
Publication Date: Jan 14, 2020
Citations: 17
License type: CC BY 4.0
