Abstract
We apply Reinforcement Learning (RL) to the problem of incremental dialogue policy learning in the context of a fast-paced dialogue game. We compare the policy learned by RL with a high-performance baseline policy which has been shown to perform very efficiently (nearly as well as humans) in this dialogue game. The RL policy outperforms the baseline policy in offline simulations (based on real user data). We provide a detailed comparison of the RL policy and the baseline policy, including how much effort and time it took to develop each of them. We also highlight the cases where the RL policy performs better, and show that understanding the RL policy can provide valuable insights which can inform the creation of an even better rule-based policy.
Highlights
Building incremental spoken dialogue systems (SDSs) has recently attracted much attention.
Our contributions are as follows: We provide a Reinforcement Learning (RL) method for incremental dialogue processing, based on simplistic features, which performs better in offline simulations than the high-performance carefully designed rule (CDR) baseline.
The policy learned using RL (LSPI with radial basis function (RBF) value-function approximation) performs significantly better than the CDR baseline.
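The highlights state that the RL policy was learned with LSPI (least-squares policy iteration) using radial basis function (RBF) value-function approximation over a fixed batch of logged interactions, matching the offline, real-user-data setting described above. As a rough illustration of how such a batch learner operates, here is a minimal sketch. The RBF feature layout, the ridge term, and the transition format are illustrative assumptions and not the paper's actual state/action design or implementation.

```python
# Minimal sketch of LSPI with RBF features over a batch of logged
# (state, action, reward, next_state, done) transitions.
# All design choices below (centers, sigma, ridge term) are assumptions.
import numpy as np

class RBFQFunction:
    """Q(s, a) = w . phi(s, a), with one block of Gaussian RBFs per action."""
    def __init__(self, centers, sigma, n_actions):
        self.centers = np.asarray(centers)        # (k, state_dim) RBF centers
        self.sigma = sigma
        self.n_actions = n_actions
        self.k = len(self.centers) + 1            # +1 for a bias feature
        self.w = np.zeros(self.k * n_actions)

    def state_features(self, s):
        # Gaussian RBF activations plus a constant bias term.
        d2 = np.sum((self.centers - s) ** 2, axis=1)
        return np.append(np.exp(-d2 / (2 * self.sigma ** 2)), 1.0)

    def features(self, s, a):
        # Place the state features in the block belonging to action a.
        phi = np.zeros(self.k * self.n_actions)
        phi[a * self.k:(a + 1) * self.k] = self.state_features(s)
        return phi

    def best_action(self, s):
        # Greedy action under the current linear Q estimate.
        return int(np.argmax([self.w @ self.features(s, a)
                              for a in range(self.n_actions)]))

def lspi(transitions, q, gamma=0.99, n_iter=20, tol=1e-4):
    """Least-Squares Policy Iteration over a fixed batch of transitions."""
    for _ in range(n_iter):
        A = np.eye(len(q.w)) * 1e-6               # small ridge for stability
        b = np.zeros(len(q.w))
        for s, a, r, s_next, done in transitions:
            phi = q.features(s, a)
            if done:
                phi_next = np.zeros_like(phi)
            else:
                phi_next = q.features(s_next, q.best_action(s_next))
            A += np.outer(phi, phi - gamma * phi_next)
            b += phi * r
        w_new = np.linalg.solve(A, b)             # LSTDQ weight update
        if np.linalg.norm(w_new - q.w) < tol:     # stop when the policy is stable
            q.w = w_new
            break
        q.w = w_new
    return q

# Example with toy 2-D states and 2 actions (purely illustrative):
# q = RBFQFunction(centers=np.random.rand(16, 2), sigma=0.3, n_actions=2)
# q = lspi(logged_transitions, q, gamma=0.95)
```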
Summary
Building incremental spoken dialogue systems (SDSs) has recently attracted much attention. Our contributions are as follows: We provide an RL method for incremental dialogue processing, based on simplistic features, which performs better in offline simulations (based on real user data) than the high-performance CDR baseline. Note that this is a very strong baseline which has been shown to perform very efficiently (nearly as well as humans) in this dialogue game (Paetzel et al., 2015). In contrast, the rule-based baselines typically used in previous work for comparison against RL policies are not as carefully engineered as they could be, i.e., they are not the result of iterative improvement and optimization using insights learned from data or user testing. This is understandable, since building a very strong baseline would be a big project by itself and would detract attention from the RL problem. We highlight the cases where the RL policy performs better, and show that understanding the RL policy can provide valuable insights which can inform the creation of an even better rule-based policy.