Abstract

In spoken dialog systems, statistical state tracking aims to improve robustness to speech recognition errors by tracking a posterior distribution over hidden dialog states. This paper introduces two novel methods for this task. First, we explain how state tracking is structurally similar to web-style ranking, enabling mature, powerful ranking algorithms to be applied. Second, we show how to use multiple spoken language understanding engines (SLUs) in state tracking — multiple SLUs can expand the set of dialog states being tracked, and give more information about each, thereby increasing both recall and precision of state tracking. We evaluate on the second Dialog State Tracking Challenge; together these two techniques yield highest accuracy in 2 of 3 tasks, including the most difficult and general task.

Highlights

  • Spoken dialog systems interact with users via natural language to help them achieve a goal

  • Dialog state tracking is difficult because errors in automatic speech recognition (ASR) and spoken language understanding (SLU) are common, and can cause the system to misunderstand the user’s needs

  • For each slot, the features encode 253 low-level quantities, such as: whether the slot value appears in this hypothesis; how many times the slot value has been observed; whether the slot value has been observed in this turn; functions of recognition metrics such as confidence score and position on the N-best list; goal priors and confusion probabilities estimated on training data (Williams, 2012; Metallinou et al., 2013); results of confirmation attempts (“Italian food, is that right?”); output of the four rule-based baseline trackers; and the system act and its relation to the goal’s slot value
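As a rough illustration of the kind of low-level quantities listed above, the following sketch computes four of them for one slot value. The function name, the input layout (one N-best list of `(slot_value, confidence)` pairs per turn, best first), and the choice to count observations per turn are all assumptions made for this example, not details taken from the paper.

```python
def slot_value_features(value, turns):
    """Compute a few illustrative features for one slot value.

    turns: list of N-best lists (hypothetical layout), one per dialog turn.
    Each N-best list holds (slot_value, confidence) pairs, best first.
    """
    # Assumption: "times observed" counts the turns in which the value appears.
    times_observed = sum(
        1 for nbest in turns if any(v == value for v, _ in nbest)
    )

    # Look the value up in the most recent turn's N-best list.
    current = turns[-1] if turns else []
    rank = next((i for i, (v, _) in enumerate(current) if v == value), None)

    return {
        "times_observed": times_observed,
        "observed_this_turn": rank is not None,
        "nbest_position": rank,  # None when the value is absent this turn
        "confidence": current[rank][1] if rank is not None else 0.0,
    }


# Two turns of made-up SLU output for a "food" slot.
turns = [
    [("italian", 0.5), ("indian", 0.3)],
    [("indian", 0.6), ("italian", 0.2)],
]
feats = slot_value_features("italian", turns)
```

The remaining quantities in the list (goal priors, confusion probabilities, confirmation results, baseline tracker outputs, system acts) would need training data or system logs and are left out of this sketch.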


Summary

Introduction

Spoken dialog systems interact with users via natural language to help them achieve a goal. The first contribution casts dialog state tracking as a ranking problem: each dialog state can be viewed as a document, and each dialog turn can be viewed as a search instance. The benefit of this construction is that it enables a rich literature of powerful ranking algorithms to be applied. Conjunctions of features are attractive in dialog state tracking, where relationships exist between low-level concepts such as grounding and confidence score. The second contribution is to incorporate the output of multiple spoken language understanding engines (SLUs) into dialog state tracking.

Proceedings of the SIGDIAL 2014 Conference, pages 282–291, Philadelphia, U.S.A., 18-20 June 2014. © 2014 Association for Computational Linguistics
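The ranking view of state tracking can be sketched in a few lines: the candidate dialog states of one turn play the role of documents, the turn plays the role of the query, and a scoring function orders them. In this minimal sketch the `Candidate` fields, the hand-set `WEIGHTS`, the linear scorer, and the softmax normalization are all assumptions chosen for illustration; a learned ranking model would replace the scorer in practice. The third feature is a conjunction of grounding and confidence.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """One dialog state hypothesis, treated as a 'document' for this turn."""
    value: str
    confidence: float  # hypothetical SLU confidence in [0, 1]
    confirmed: bool    # hypothetical grounding flag (user confirmed the value)


def features(c):
    grounded = 1.0 if c.confirmed else 0.0
    # Last entry is a conjunction: confidence counts extra when grounded.
    return [c.confidence, grounded, grounded * c.confidence]


def score(c, weights):
    return sum(w * f for w, f in zip(weights, features(c)))


def rank_turn(candidates, weights):
    """Rank one turn's candidates and normalize scores with a softmax."""
    scores = [score(c, weights) for c in candidates]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    pairs = [(c.value, e / total) for c, e in zip(candidates, exps)]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Hand-set weights stand in for a trained ranker (assumption).
WEIGHTS = [1.0, 0.5, 2.0]
turn = [
    Candidate("italian", 0.6, True),
    Candidate("indian", 0.7, False),
    Candidate("none", 0.1, False),
]
ranking = rank_turn(turn, WEIGHTS)
```

Here "italian" outranks "indian" despite its lower raw confidence, because the conjunction feature rewards a value that is both grounded and reasonably confident, which neither feature captures alone.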

Background
Preliminaries
User goal features
Evaluation metrics
Baselines
Web-style ranking
Multiple SLU engines
SLU Engines
Results with multiple SLU engines
Model averaging
Joint goal tracking summary
Fixed-size state components
Blind evaluation results
Conclusion