Abstract

Spoken Dialogue Systems (SDSs) have evolved over the last three decades from simple single-word command speech recognition applications to the large-vocabulary continuous systems used today. SDSs have long relied on hierarchies of stochastic architectures, in particular Hidden Markov Models (HMMs), for their components and sub-components. In this paper, we examine the applications of HMMs in speech recognition, including phoneme recognition, word recognition and stochastic grammars. Other applications of HMMs within SDSs are also covered, including word tagging and semantic classification at the parsing level, dialogue management and strategy optimisation, and stochastic reply generation. We then propose that the Hierarchical Hidden Markov Model (HHMM) of Fine, Singer and Tishby serve as a replacement for many of these specialised HMMs, creating a more unified and consistent architecture. We explore the feasibility of this model within a specific information-retrieval SDS and demonstrate how HMM merging can be combined with contextual and entropic clustering to dynamically generate HHMM data structures. Issues of training time and the applicability of HHMMs to natural language processing are also examined.
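To make the HHMM data structure mentioned above concrete, the following is a minimal Python sketch (not taken from the paper) of the hierarchical state layout in a Fine, Singer and Tishby-style HHMM: internal states own a child sub-HMM, production states emit symbols, and an "END" state at each level returns control to the parent. All class and state names, and the toy dialogue-act example at the bottom, are illustrative assumptions rather than the authors' implementation.

    # Minimal HHMM state sketch (illustrative only).
    from dataclasses import dataclass, field
    from typing import Dict, List, Optional
    import random

    @dataclass
    class HHMMState:
        name: str
        emissions: Optional[Dict[str, float]] = None              # production state: symbol -> prob
        children: List["HHMMState"] = field(default_factory=list) # internal state: sub-HMM states
        transitions: Dict[str, Dict[str, float]] = field(default_factory=dict)  # horizontal, within this level
        initial: Dict[str, float] = field(default_factory=dict)   # vertical entry distribution

        def is_production(self) -> bool:
            return self.emissions is not None

    def sample(state: HHMMState, rng: random.Random) -> List[str]:
        """Generate one observation sequence by a top-down random walk through the hierarchy."""
        if state.is_production():
            symbols, probs = zip(*state.emissions.items())
            return [rng.choices(symbols, probs)[0]]
        out: List[str] = []
        by_name = {c.name: c for c in state.children}
        current = rng.choices(list(state.initial), list(state.initial.values()))[0]
        while current != "END":                       # "END" returns control to the parent state
            out.extend(sample(by_name[current], rng))
            nxt = state.transitions[current]
            current = rng.choices(list(nxt), list(nxt.values()))[0]
        return out

    # Toy usage: a "greet" internal state over two production states.
    hello = HHMMState("hello", emissions={"hello": 0.7, "hi": 0.3})
    name  = HHMMState("name",  emissions={"system": 1.0})
    greet = HHMMState("greet", children=[hello, name],
                      initial={"hello": 1.0},
                      transitions={"hello": {"name": 0.6, "END": 0.4},
                                   "name":  {"END": 1.0}})
    print(sample(greet, random.Random(0)))

In a merging-and-clustering scheme such as the one the paper describes, internal states like "greet" would not be hand-built as above but induced by grouping merged HMM states according to contextual and entropic criteria.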
