Sample Efficient Deep Reinforcement Learning for Dialogue Systems With Large Action Spaces

Gellert Weisz, Pei-Hao Su, Milica Gasic, Pawel Budzianowski

doi:10.1109/taslp.2018.2851664

Abstract

In spoken dialogue systems, we aim to deploy artificial intelligence to build automated dialogue agents that can converse with humans. A part of this effort is the policy optimization task, which attempts to find a policy describing how to respond to humans, in the form of a function taking the current state of the dialogue and returning the response of the system. In this paper, we investigate deep reinforcement learning approaches to solve this problem. Particular attention is given to actor-critic methods, off-policy reinforcement learning with experience replay, and various methods aimed at reducing the bias and variance of estimators. When combined, these methods result in the previously proposed ACER algorithm that gave competitive results in gaming environments. These environments, however, are fully observable and have a relatively small action set so, in this paper, we examine the application of ACER to dialogue policy optimization. We show that this method beats the current state of the art in deep learning approaches for spoken dialogue systems. This not only leads to a more sample efficient algorithm that can train faster, but also allows us to apply the algorithm in more difficult environments than before. We thus experiment with learning in a very large action space, which has two orders of magnitude more actions than previously considered. We find that ACER trains significantly faster than the current state of the art.
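
The abstract names off-policy learning with experience replay and variance reduction as key ingredients. As a rough illustration only, and not the authors' implementation, the sketch below shows a replay buffer that stores the behaviour policy's probability for each action, together with a truncated importance weight, one common way to bound the variance of off-policy estimates. The names `ReplayBuffer` and `truncated_importance_weight`, and the default `clip` value, are hypothetical choices made for this example.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past dialogue turns (hypothetical helper)."""

    def __init__(self, capacity):
        # Oldest transitions are dropped once capacity is reached.
        self.items = deque(maxlen=capacity)

    def add(self, state, action, reward, behaviour_prob):
        # behaviour_prob is the probability the acting policy gave to `action`
        # at the time it was taken; it is needed later for off-policy correction.
        self.items.append((state, action, reward, behaviour_prob))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement from the stored transitions.
        k = min(batch_size, len(self.items))
        return rng.sample(list(self.items), k)


def truncated_importance_weight(target_prob, behaviour_prob, clip=10.0):
    """Ratio of current-policy to behaviour-policy probability, capped at `clip`.

    Capping bounds the variance of the off-policy estimate at the cost of
    some bias; the clip value here is an assumption, not taken from the paper.
    """
    return min(clip, target_prob / behaviour_prob)


buffer = ReplayBuffer(capacity=2)
buffer.add(state="s0", action=1, reward=0.0, behaviour_prob=0.25)
buffer.add(state="s1", action=0, reward=1.0, behaviour_prob=0.50)
buffer.add(state="s2", action=2, reward=0.0, behaviour_prob=0.10)  # evicts s0
batch = buffer.sample(batch_size=2)
```

A replayed action whose probability under the current policy has risen sharply would otherwise receive a very large weight; the cap keeps a single transition from dominating an update.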

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Nov 1, 2018
Citations: 101

Similar Papers

Experience Replay-based Deep Reinforcement Learning for Dialogue Management Optimisation
Shrikant Malviya ... Uma Shanker Tiwary
ACM Transactions on Asian and Low-Resource Language Information Processing | VOL. -
25 May 2022

Intelligent Reflecting Surface-Aided Device-to-Device Communication: A Deep Reinforcement Learning Approach
Ajmery Sultana ... Xavier Fernando
Future Internet | VOL. 14
29 Aug 2022

Deep reinforcement learning for stock portfolio optimization by connecting with modern portfolio theory
Junkyu Jang ... Nohyoon Seong
Expert Systems with Applications | VOL. 218
13 Jan 2023

Conscientiousness and Neuroticism Predicted Learning Approaches of Medical Students
Jamilah Al-Muhammady Mohammad ... Muhamad Saiful Bahri Yusoff
Education in Medicine Journal | VOL. 14
27 Dec 2022
