Abstract
Intelligent planning algorithms such as the Partially Observable Markov Decision Process (POMDP) have succeeded in dialog management applications [10, 11, 12] because they are robust to the inherent uncertainty of human interaction. Like all dialog planning systems, however, POMDPs require an accurate model of the user (e.g., what the user might say or want). POMDPs are generally specified using a large probabilistic model with many parameters. These parameters are difficult to specify from domain knowledge, and gathering enough data to estimate them accurately a priori is expensive. In this paper, we take a Bayesian approach to learning the user model simultaneously with the dialog manager's policy. At the heart of our approach is an efficient incremental update algorithm that allows the dialog manager to replan just long enough to improve the current dialog policy given data from recent interactions. The update process has a relatively small computational cost, preventing long delays in the interaction. We demonstrate a robust dialog manager that learns from interaction data, outperforming a hand-coded model both in simulation and in a robotic wheelchair application.
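To make the abstract's "Bayesian approach to learning the user model" concrete, the following is a minimal sketch of one common realization: conjugate Dirichlet updates over a discrete POMDP observation model, with posterior sampling to support replanning under model uncertainty. The class name, method names, and the choice of observation (rather than transition) parameters are illustrative assumptions, not the paper's interface.

```python
# A minimal sketch, assuming (as the abstract implies but does not specify)
# symmetric Dirichlet priors over the POMDP's discrete observation
# probabilities p(o | s, a). All names here are hypothetical.
import numpy as np

class DirichletUserModel:
    """Tracks Dirichlet pseudo-counts over p(observation | state, action)."""

    def __init__(self, n_states, n_actions, n_obs, prior=1.0):
        # prior > 0 is a symmetric Dirichlet pseudo-count (uniform prior).
        self.counts = np.full((n_states, n_actions, n_obs), prior)

    def update(self, state, action, obs):
        # Conjugate update: each observed (s, a, o) triple adds one count.
        self.counts[state, action, obs] += 1.0

    def mean_model(self):
        # Posterior-mean estimate of p(o | s, a), usable by the planner.
        return self.counts / self.counts.sum(axis=-1, keepdims=True)

    def sample_model(self, rng):
        # Draw one plausible observation model from the posterior; replanning
        # over a few such samples approximates planning under model uncertainty.
        flat = self.counts.reshape(-1, self.counts.shape[-1])
        samples = np.stack([rng.dirichlet(row) for row in flat])
        return samples.reshape(self.counts.shape)

rng = np.random.default_rng(0)
model = DirichletUserModel(n_states=3, n_actions=2, n_obs=4)
model.update(state=1, action=0, obs=2)   # evidence from one dialog turn
print(model.mean_model()[1, 0])          # updated p(o | s=1, a=0)
```

Because each dialog turn only increments a handful of counts, an update of this form is cheap, which is consistent with the abstract's claim that replanning after recent interactions need not cause long delays.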