Weighted finite state transducer based statistical dialog management

Chiori Hori ,Kiyonori Ohtake ,Teruhisa Misu ,Hideki Kashioka ,Satoshi Nakamura

doi:10.1109/asru.2009.5373350

Abstract

We proposed a dialog system using a weighted finite-state transducer (WFST) in which user concept and system action tags are input and output of the transducer, respectively. The WFST-based platform for dialog management enables us to combine various statistical models for dialog management (DM), user input understanding and system action generation, and then search the best system action in response to user inputs among multiple hypotheses. To test the potential of the WFST-based DM platform using statistical models, we constructed a dialog system using a human-to-human spoken dialog corpus for hotel reservation, which is annotated with Interchange Format (IF). A scenario WFST and a spoken language understanding (SLU) WFST were obtained from the corpus and then composed together and optimized. We evaluated the detection accuracy of the system next action tags using Mean Reciprocal Ranking (MRR). Finally, we constructed a full WFST-based dialog system by composing SLU, scenario and sentence generation (SG) WFSTs. Humans read the system responses in natural language and judged the quality of the responses. We confirmed that the WFST-based DM platform was capable of handling various spoken language and scenarios when the user concept and system action tags are consistent and distinguishable.

Full Text