Abstract

Managing multimodal interactions between humans and computer systems requires a combination of state estimation based on multiple observation streams and optimisation of time-dependent action selection. Previous work using partially observable Markov decision processes (POMDPs) for multimodal interaction has focused on simple turn-based systems. However, state persistence and implicit state transitions are frequent in real-world multimodal interactions. These phenomena cannot be fully modelled using turn-based systems, where the timing of system actions is a non-trivial issue. In addition, in prior work the POMDP parameterisation has been either hand-coded or learned from labelled data, which requires significant domain-specific knowledge and is labour-intensive. We therefore propose a nonparametric Bayesian method to automatically infer the (distributional) representations of POMDP states for multimodal interactive systems, without using any domain knowledge. We develop an extended version of the infinite POMDP method to better address the state persistence, implicit transitions, and timing issues observed in real data. The main contribution is a “sticky” infinite POMDP model that is biased towards self-transitions. The performance of the proposed unsupervised approach is evaluated on both artificially synthesised data and a manually transcribed and annotated human-human interaction corpus. We show statistically significant improvements (e.g. in the ability of the planner to recall human bartender actions) over a supervised POMDP method.
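
The abstract does not spell out how the "sticky" bias is constructed, so the following is only a minimal sketch of one common way to bias a nonparametric transition prior towards self-transitions, in the style of a sticky HDP-HMM. It assumes a truncated stick-breaking approximation and draws each transition row from Dirichlet(alpha * beta + kappa * one_hot(j)). The names `alpha`, `kappa`, `gamma`, `truncation`, and all function names are hypothetical illustrations, not the paper's notation or implementation, and actions and observations are left out entirely.

```python
import random


def stick_breaking(gamma, truncation, rng):
    """Truncated stick-breaking weights, renormalised to sum to 1 (assumed approximation)."""
    weights, remaining = [], 1.0
    for _ in range(truncation):
        v = rng.betavariate(1.0, gamma)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    total = sum(weights)
    return [w / total for w in weights]


def dirichlet(concentrations, rng):
    """Sample a Dirichlet vector by normalising independent Gamma draws."""
    draws = [rng.gammavariate(c, 1.0) for c in concentrations]
    total = sum(draws)
    return [d / total for d in draws]


def sticky_transition_matrix(alpha, kappa, beta, rng):
    """Row j ~ Dirichlet(alpha * beta + kappa * one_hot(j)).

    kappa is the hypothetical self-transition bias; kappa = 0 recovers the
    non-sticky prior in this sketch.
    """
    rows = []
    for j in range(len(beta)):
        conc = [alpha * b + (kappa if k == j else 0.0) for k, b in enumerate(beta)]
        rows.append(dirichlet(conc, rng))
    return rows


def mean_self_transition(matrix):
    """Average of the diagonal entries, i.e. the mean probability of staying in a state."""
    return sum(matrix[j][j] for j in range(len(matrix))) / len(matrix)


rng = random.Random(0)
beta = stick_breaking(gamma=5.0, truncation=8, rng=rng)
plain = sticky_transition_matrix(alpha=4.0, kappa=0.0, beta=beta, rng=rng)
sticky = sticky_transition_matrix(alpha=4.0, kappa=20.0, beta=beta, rng=rng)

print("mean self-transition, kappa=0: ", round(mean_self_transition(plain), 3))
print("mean self-transition, kappa=20:", round(mean_self_transition(sticky), 3))
```

In this sketch, a larger `kappa` shifts prior mass onto the diagonal of the transition matrix, which is one way to encode the state persistence the abstract describes; how the actual model sets or infers such a bias is not stated here.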
