A stochastic model of human-machine interaction for learning dialog strategies

E. Levin, W. Eckert, R. Pieraccini

doi:10.1109/89.817450

Abstract

We propose a quantitative model for dialog systems that can be used for learning the dialog strategy. We claim that the problem of dialog design can be formalized as an optimization problem with an objective function reflecting different dialog dimensions relevant for a given application. We also show that any dialog system can be formally described as a sequential decision process in terms of its state space, action set, and strategy. With additional assumptions about the state transition probabilities and cost assignment, a dialog system can be mapped to a stochastic model known as a Markov decision process (MDP). A variety of data-driven algorithms for finding the optimal strategy (i.e., the one that optimizes the criterion) are available within the MDP framework, based on reinforcement learning. To use the available training data effectively, we propose a combination of supervised and reinforcement learning: supervised learning is used to estimate a model of the user, i.e., the MDP parameters that quantify the user's behavior. A reinforcement learning algorithm is then used to estimate the optimal strategy while the system interacts with the simulated user. This approach is tested for learning the strategy in an air travel information system (ATIS) task. The experimental results we present in this paper show that it is indeed possible to find a simple criterion, a state space representation, and a simulated user parameterization in order to automatically learn a relatively complex dialog behavior, similar to one that was heuristically designed by several research groups.
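To make the idea of learning a strategy against a simulated user concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical two-slot task (for example origin and destination), a state that only counts filled slots, two actions (`ask`, `close`), a fixed user response probability `P_ANSWER` (which in the approach described above would instead be estimated from data by supervised learning), and illustrative cost values. Tabular Q-learning is used here as one example of a reinforcement learning algorithm; the abstract does not say which algorithm the paper uses.

```python
import random

# All names and numbers below are illustrative assumptions, not taken from the paper.
N_SLOTS = 2                  # hypothetical task with two slots to fill
ACTIONS = ("ask", "close")   # ask for a missing slot, or close the dialog
P_ANSWER = 0.8               # assumed chance the simulated user fills a slot when asked
TURN_COST = 1.0              # cost of every system turn (dialog length)
MISSING_SLOT_COST = 10.0     # cost per unfilled slot when the dialog is closed


def simulated_user_step(state, action, rng):
    """Return (cost, next_state); next_state is None when the dialog ends."""
    if action == "close":
        return TURN_COST + MISSING_SLOT_COST * (N_SLOTS - state), None
    if state < N_SLOTS and rng.random() < P_ANSWER:
        return TURN_COST, state + 1   # user supplied one more slot
    return TURN_COST, state           # user did not answer, or nothing left to fill


def learn_strategy(episodes=5000, alpha=0.1, epsilon=0.2, max_turns=50, seed=0):
    """Tabular Q-learning that minimizes expected total dialog cost."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_turns):
            # Epsilon-greedy: mostly pick the cheapest known action, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = min(ACTIONS, key=lambda a: q[(state, a)])
            cost, nxt = simulated_user_step(state, action, rng)
            future = 0.0 if nxt is None else min(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (cost + future - q[(state, action)])
            if nxt is None:
                break
            state = nxt
    # The learned strategy maps each state to its lowest-cost action.
    return {s: min(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_SLOTS + 1)}


strategy = learn_strategy()
print(strategy)
```

Under these assumed costs, closing early is far more expensive than asking again, so the learned strategy should keep asking until both slots are filled and only then close the dialog. Changing `MISSING_SLOT_COST` relative to `TURN_COST` is a simple way to see how the objective function shapes the resulting behavior.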

Journal: IEEE Transactions on Speech and Audio Processing
Publication Date: Jan 1, 2000
Citations: 549
