Abstract

Spoken language understanding (SLU) is fundamental to a service robot's handling of natural language task requests. SLU comprises two basic problems: intent determination (ID) and slot filling (SF). The slot-gated recurrent neural network joint model for the two tasks has been shown to outperform single-task models and has achieved state-of-the-art performance. However, in task requests for home service robots, the information carried by a given word often depends strongly on the key verbs in the sentence, and current methods struggle to capture this relation. In this paper, we extract the key instructional verb, which carries most of the core task information, using dependency parsing, and construct a feature that combines the key verb with its contextual information to address this problem. To further improve the slot-gated model, we exploit the strong relation between intent and slots: by introducing intent attention vectors into the slot attention vectors through a global-level gate and an element-level gate, we propose a novel dual slot-gated mechanism that explicitly models the complex relations between the ID and SF predictions and optimizes the global prediction result. Experiments on the ATIS dataset and an extended home service task request (SRTR) dataset based on FrameNet show that the proposed method outperforms state-of-the-art methods on both tasks. In particular, on SRTR, the results of SF, ID, and sentence-level semantic frame filling improve by 1.7%, 1.1%, and 1.7%, respectively.
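The dual slot-gated mechanism described above can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: the hidden states, attention contexts, and parameters are random stand-ins, and the exact fusion of the two gates is an assumption based on the standard slot-gated formulation, where a gate scalar is computed as g = sum(v * tanh(c_slot + W c_intent)).

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 8  # sequence length and hidden size (illustrative values)

h = rng.standard_normal((T, d))        # BiLSTM hidden states (stand-in)
c_slot = rng.standard_normal((T, d))   # per-step slot attention contexts
c_intent = rng.standard_normal(d)      # sentence-level intent attention context

# Trainable parameters (random stand-ins here)
W = rng.standard_normal((d, d))
v = rng.standard_normal(d)

# Global-level gate: one scalar per time step,
# g_t = sum(v * tanh(c_slot_t + W @ c_intent))
g_global = np.tanh(c_slot + c_intent @ W.T) @ v   # shape (T,)

# Element-level gate: one weight per hidden dimension and time step
g_elem = np.tanh(c_slot + c_intent @ W.T)         # shape (T, d)

# Fuse both gates into the features fed to the per-token slot softmax,
# so the intent context modulates every slot prediction
slot_features = h + g_global[:, None] * c_slot + g_elem * c_intent
assert slot_features.shape == (T, d)
```

The global-level gate scales the whole slot context by how consistent it is with the intent, while the element-level gate lets each hidden dimension be weighted separately; combining both is what makes the mechanism "dual".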

Highlights

  • In the past decade, human-computer collaboration promoted by natural language has attracted a great deal of attention in the field of intelligent robots, which includes daily assistance [1], medical care [2], manufacturing [3], indoor or outdoor navigation [4]–[6], and social companionship [7]–[9]

  • One of the possible reasons for this improvement is that some instructions issued in spoken language do not contain retrievable verbs in synonym dictionaries, which results in matching failure, and the use of the recurrent neural network model can infer the intents of instructions from contextual information

  • Using the characteristics of spoken task requests in the field of home service robots, this paper proposes a task understanding method based on the key verb and its contextual features


Summary

INTRODUCTION

Human-computer collaboration promoted by natural language has attracted a great deal of attention in the field of intelligent robots, which includes daily assistance [1], medical care [2], manufacturing [3], indoor or outdoor navigation [4]–[6], and social companionship [7]–[9]. The focus of this paper is a method that fully mines the association between key verbs and context words in task requests and maps each sentence to the expected robot action frame. By introducing the features of key verbs and their contextual information into the model input, the performance of our recurrent neural network with a slot-gated mechanism on the joint ID and SF tasks is improved. Popular methods include the recurrent neural network (RNN) [26], the joint model for ID and SF [27], and the attention-based model [28]. Most of these studies focus on simple query statements, and the performance improvement in task understanding for service robots is not significant [29]. When both the prediction and the actual label are negative, the case is called a true negative (TN).
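The true negative (TN) definition above is one cell of the standard confusion matrix used when evaluating ID and SF predictions. As a quick illustration with made-up binary labels (not data from the paper), the four counts and the derived precision, recall, and F1 can be computed as:

```python
# Toy binary labels: 1 = positive class, 0 = negative class (illustrative only)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
# Here tp=3, tn=3, fp=1, fn=1, so precision = recall = f1 = 0.75
```

Slot filling is usually scored with F1 over these counts, while intent determination and sentence-level frame filling are scored with accuracy, (tp + tn) / total.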

FEATURE BASED ON KEY VERB CONTEXT
ATTENTION-BASED RNN MODEL
JOINT OPTIMIZATION MODEL WITH DUAL SLOT-GATED MECHANISM
EXPERIMENT
Findings
CONCLUSION AND FUTURE WORK