Abstract

Purpose

Existing virtual agents (VAs) present in dialogue systems are either information-retrieval based or static goal-driven. However, in real-world situations, end-users might not have a known and fixed goal beforehand for the task, i.e., they may upgrade/downgrade/update their goal components in real time to maximize their utility values. Existing VAs are unable to handle such dynamic goal-oriented situations.

Methodology

Due to the absence of any related dialogue dataset where such choice deviations are present, we have created a conversational dataset called Deviation adapted Virtual Agent (DevVA), with manual annotation of its corresponding intents, slots, and sentiment labels. A Dynamic Goal Driven Dialogue Agent (DGDVA) has been developed by incorporating a Dynamic Goal Driven Module (GDM) on top of a deep reinforcement learning-based dialogue manager. Over the course of a conversation, the user's sentiment provides grounded feedback about the agent's behavior, including its goal-serving actions. User sentiment serves as an appropriate indicator of goal discrepancy, guiding the agent to complete the user's desired task satisfactorily. Negative sentiment expressed by the user about an aspect of the provided choice is treated as a discrepancy, which the GDM resolves based on the observed discrepancy and the current dialogue state. The goal-update capability and the VA's interactiveness enable end-users to accomplish their desired task satisfactorily.

Findings

The experimental results illustrate that DGDVA can handle dynamic goals with maximum user satisfaction and a significantly higher success rate. The interaction drives the user to decide his/her final goal through the latent specification of possible choices and the information retrieved and provided by the dialogue agent. Through the qualitative and quantitative experimental results, we conclude that the proposed sentiment-aware VA adapts to users' dynamic behavior in its goal setting with substantial efficacy in terms of the primary objective, i.e., task success rate (0.88).

Practical implications

In the real world, it can be argued that many people do not have a predefined and fixed goal for tasks such as online shopping, movie booking, and restaurant booking. They tend first to explore the available options that align with their minimum requirements and then choose one amongst them. DGDVA provides maximum user satisfaction as it enables users to accomplish a dynamic goal that yields additional utilities along with the essential ones.

Originality

To the best of our knowledge, this is the first effort towards the development of a dynamic goal-adapted task-oriented dialogue agent that can serve user goals dynamically until the user is satisfied.
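
The goal-update mechanism described above can be pictured with a short sketch: when the user expresses negative sentiment about an aspect (slot) of an offered choice, the module records a discrepancy and relaxes or replaces that goal component before the dialogue manager queries the knowledge base again. The class names, slot structure, and fallback logic below are illustrative assumptions, not the authors' GDM implementation.

```python
# Hedged sketch of a goal-update module that treats negative sentiment about an
# offered slot value as a goal discrepancy. All names are hypothetical.

from dataclasses import dataclass, field


@dataclass
class DialogueState:
    goal: dict = field(default_factory=dict)        # current slot -> value constraints
    last_offer: dict = field(default_factory=dict)  # slot -> value of the option just proposed


class GoalDrivenModule:
    """Resolves goal discrepancies signalled by negative user sentiment."""

    def __init__(self, alternatives):
        # alternatives: slot -> ordered list of fallback values (e.g. price tiers)
        self.alternatives = alternatives

    def update_goal(self, state, sentiment, mentioned_slots):
        """If sentiment is negative towards an offered slot value, relax or
        replace that constraint so the dialogue manager can re-query the KB."""
        if sentiment >= 0:
            return state  # no discrepancy observed
        for slot in mentioned_slots:
            offered = state.last_offer.get(slot)
            fallbacks = [v for v in self.alternatives.get(slot, []) if v != offered]
            if fallbacks:
                state.goal[slot] = fallbacks[0]   # upgrade/downgrade the goal component
            else:
                state.goal.pop(slot, None)        # drop the constraint entirely
        return state


# Usage: the user dislikes the offered price tier, so that goal component is revised.
gdm = GoalDrivenModule({"price": ["moderate", "cheap", "expensive"]})
state = DialogueState(goal={"price": "expensive", "cuisine": "italian"},
                      last_offer={"price": "expensive"})
state = gdm.update_goal(state, sentiment=-0.7, mentioned_slots=["price"])
print(state.goal)  # {'price': 'moderate', 'cuisine': 'italian'}
```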

Highlights

  • In recent times, conversational artificial intelligence has become one of the prominent research areas because of its utility and efficacy [1]

  • This paper presents the first step towards developing a dynamic goal-oriented virtual agent capable of handling variations in the user's goal in real time

  • The variation in goal can arise because a user may want to decide his/her goal depending upon the determined goal components and the virtual agent's (VA's) serving capability, or because of a mismatch in the user's implicit slot values


Introduction

Conversational artificial intelligence has become one of the prominent research areas because of its utility and efficacy [1]. The fundamental task of a dialogue manager is to optimize the dialogue policy, which decides the behavior of the dialogue system based on the given dialogue history. This dialogue optimization [9] problem can be viewed as a sequential decision-making problem that can be solved efficiently through reinforcement learning [14] techniques. Two approaches are commonly followed. The first is the neural Sequence-to-Sequence (Seq2Seq) supervised approach [16], where an agent learns what to generate as a response given the previous user utterances. The latter treats the dialogue manager as a Partially Observable Markov Decision Process (POMDP) [17], which can be optimized through Reinforcement Learning (RL). One feasible and well-accepted approach is to build a user simulator [18] based upon the problem and the nature of the corpus.
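
To make the RL framing concrete, the following is a minimal sketch of dialogue policy optimization as sequential decision making: a tabular Q-learning agent trained against a toy user simulator. The state encoding, action set, and reward scheme are illustrative assumptions rather than the setup used in the paper.

```python
# Hedged sketch: dialogue policy learning via tabular Q-learning against a toy
# user simulator. States, actions, and rewards here are illustrative only.

import random
from collections import defaultdict

ACTIONS = ["request_slot", "offer", "confirm", "close"]
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.2

Q = defaultdict(float)  # (state, action) -> estimated return


def simulate_user(state, action):
    """Toy user simulator: state is (slot_filled, option_offered)."""
    filled, offered = state
    if action == "request_slot" and not filled:
        return (True, offered), 0.0, False   # user supplies the missing slot
    if action == "offer" and filled:
        return (filled, True), 0.0, False    # agent proposes a matching option
    if action == "close" and offered:
        return state, 1.0, True              # task completed successfully
    return state, -0.1, False                # penalise unhelpful turns


def choose(state):
    """Epsilon-greedy action selection over the current Q-estimates."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])


for _ in range(2000):                        # training episodes
    state, done = (False, False), False
    for _ in range(10):                      # cap the number of dialogue turns
        action = choose(state)
        nxt, reward, done = simulate_user(state, action)
        target = reward if done else reward + GAMMA * max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        state = nxt
        if done:
            break

# After training, the greedy policy closes the dialogue once an option has been offered.
print(max(ACTIONS, key=lambda a: Q[((True, True), a)]))  # expected: 'close'
```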
