Abstract

We propose an architecture for integrating discourse processing and speech recognition (SR) in spoken dialogue systems, and apply it to a distributed battlefield simulation system used for military training. Under this architecture, discourse functions previously distributed throughout the interface code are collected into a centralized discourse capability. The Dialogue Manager (DM) acts as a third-party mediator, overseeing the translation of input and output utterances between English and the command language of the backend system. The DM calls the Discourse Processor (DP) to update the context representation each time an utterance is issued or a salient nonlinguistic event occurs. For task-based human-computer dialogue systems, the DM consults three sources of nonlinguistic context constraints in addition to the linguistic discourse state: (1) a user model, (2) a static domain model, and (3) a dynamic backend model (BEM). We describe a four-step recovery algorithm invoked by the DM whenever an item is unclear in the current context or an interpretation error occurs, and show how parameter settings on the algorithm can shift the overall behavior of the system from tutor to trainer. This illustrates how limited (inexpensive) dialogue processing functionality, judiciously selected and designed in conjunction with expectations for human dialogue behavior, can compensate for inevitable limitations in the SR, the NL processor, the backend software application, or even the user's understanding of the task or the software system.
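The abstract gives no implementation detail beyond the component roles, but the mediator pattern it describes can be sketched in code. The Python sketch below is a hypothetical illustration only: every name in it (DialogueManager, DiscourseProcessor, Context, Mode, and the particular four recovery steps) is an assumption made for exposition, not the paper's API, and the recovery steps merely stand in for whatever the paper's actual four-step algorithm does.

    from dataclasses import dataclass, field
    from enum import Enum, auto
    from typing import Optional


    class Mode(Enum):
        TUTOR = auto()    # intervene early, explain each recovery step
        TRAINER = auto()  # intervene late, let the user self-correct


    @dataclass
    class Context:
        """Linguistic discourse state plus the three nonlinguistic models."""
        discourse_state: dict = field(default_factory=dict)  # updated by the DP
        user_model: dict = field(default_factory=dict)       # (1) user model
        domain_model: dict = field(default_factory=dict)     # (2) static domain model
        backend_model: dict = field(default_factory=dict)    # (3) dynamic backend model (BEM)


    class DiscourseProcessor:
        def update(self, ctx: Context, event: str) -> None:
            # Record each utterance or salient nonlinguistic event in the
            # discourse state so later references can be resolved against it.
            ctx.discourse_state.setdefault("history", []).append(event)

        def resolve(self, ctx: Context, item: str) -> Optional[str]:
            # Hypothetical lookup: succeed only if the item is already salient.
            return item if item in ctx.discourse_state.get("history", []) else None


    class DialogueManager:
        """Third-party mediator between English utterances and backend commands."""

        def __init__(self, mode: Mode = Mode.TRAINER):
            self.dp = DiscourseProcessor()
            self.ctx = Context()
            self.mode = mode

        def interpret(self, utterance: str) -> str:
            # Resolve against the prior context, then have the DP fold the new
            # utterance into the discourse state (the DP is called on every
            # utterance and on salient nonlinguistic events).
            resolved = self.dp.resolve(self.ctx, utterance) or self._recover(utterance)
            self.dp.update(self.ctx, utterance)
            return f"BACKEND_CMD({resolved})"

        def _recover(self, item: str) -> str:
            # Illustrative four-step recovery: the abstract does not name the
            # paper's actual steps, so these stand in for progressively more
            # expensive sources of constraint.
            steps = [
                lambda: self.ctx.user_model.get(item),    # 1. user model
                lambda: self.ctx.domain_model.get(item),  # 2. static domain model
                lambda: self.ctx.backend_model.get(item), # 3. dynamic backend model
                lambda: self._ask_user(item),             # 4. clarification subdialogue
            ]
            for step in steps:
                result = step()
                if result is not None:
                    return result
                # A tutor reports every failed step; a trainer stays quiet and
                # leaves room for the user to notice and self-correct.
                if self.mode is Mode.TUTOR:
                    print(f"[tutor] could not resolve {item!r} at this step")
            return item  # defensive fallback: pass the item through unresolved

        def _ask_user(self, item: str) -> str:
            # Placeholder for a clarification question posed to the user.
            return input(f"What do you mean by {item!r}? ") or item

In this sketch the tutor/trainer distinction collapses into a single mode parameter controlling how eagerly the system comments on failed recovery steps; the paper presumably exposes finer-grained settings on the algorithm itself. A minimal usage example, again with an invented domain-model entry:

    dm = DialogueManager(mode=Mode.TUTOR)
    dm.ctx.domain_model["platoon"] = "UNIT_TYPE:PLATOON"
    print(dm.interpret("platoon"))
    # [tutor] could not resolve 'platoon' at this step
    # BACKEND_CMD(UNIT_TYPE:PLATOON)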
